An interesting exercise.

So I’m developing this game in my spare time, and I needed a list of commonly used words. I downloaded the Project Gutenberg DVD of popular texts and an open source dictionary of words, then built a simple program that scanned the DVD, parsed out all the words, checked each one against the dictionary, and kept a running count for every match to arrive at a list of commonly used words.
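
The original program isn't shown, so here is only a rough Python sketch of that kind of pipeline. It assumes the texts are plain `.txt` files under one directory and the dictionary is a file with one word per line; the names `count_words`, `top_words`, `corpus_dir`, and `dictionary_path` are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a "word" is a run of ASCII letters, optionally with one
# straight apostrophe inside it (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_words(text, dictionary, counts):
    """Add to `counts` every word in `text` that also appears in `dictionary`."""
    for word in WORD_RE.findall(text.lower()):
        if word in dictionary:
            counts[word] += 1


def top_words(corpus_dir, dictionary_path, n=50):
    """Return the n most frequent dictionary words found under corpus_dir."""
    dictionary = {
        line.strip().lower()
        for line in Path(dictionary_path).read_text(encoding="utf-8").splitlines()
        if line.strip()
    }
    counts = Counter()
    # Hypothetical layout: every text is a .txt file somewhere below corpus_dir.
    for path in sorted(Path(corpus_dir).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        count_words(text, dictionary, counts)
    return [word for word, _ in counts.most_common(n)]
```

Keeping the dictionary in a set makes each lookup cheap, which matters when the corpus runs to millions of words.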

And here are the top 50 words I found, in order:

the
of
and
to
in
that
he
was
it
his
is
with
for
as
you
on
had
not
be
at
but
by
this
her
or
which
from
have
they
she
all
him
we
are
were
my
me
so
one
an
no
their
if
there
who
said
them
when
would
been

I don’t know if this is cool or… mundane…

One thought on “An interesting exercise.”

  1. The next 50 words, by the way, are:

    will
    what
    do
    up
    any
    out
    more
    then
    man
    into
    your
    has
    other
    some
    now
    our
    could
    very
    about
    time
    can
    its
    than
    may
    upon
    project
    only
    like
    little
    these
    such
    see
    work
    two
    us
    did
    great
    well
    should
    made
    before
    must
    after
    over
    good
    men
    first
    how
    day
    know

    I think “project” shows up because I’m not stripping the Project Gutenberg headers.
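
One way to strip those headers, sketched in Python. This assumes each text marks its body with lines beginning `*** START OF` and `*** END OF`, as many Project Gutenberg files do; older files use other wording and would pass through unchanged. The function name `strip_gutenberg_boilerplate` is invented for this example.

```python
def strip_gutenberg_boilerplate(text):
    """Return only the lines between the START and END marker lines.

    Assumption: markers look like '*** START OF ...' and '*** END OF ...'.
    If a marker is missing, that side of the text is left untouched.
    """
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        marker = line.strip().upper()
        if marker.startswith("*** START OF"):
            start = i + 1  # body begins on the line after the marker
        elif marker.startswith("*** END OF"):
            end = i  # body stops just before the marker
            break
    return "\n".join(lines[start:end])
```

Running each file through a step like this before counting should keep the license text from inflating words such as “project”.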
