[ajug-members] Geek Challenge
aaron
aaron at pd.org
Sat Oct 22 21:06:18 EDT 2005
See:
Fluid Concepts and Creative Analogies
-- Computer Models of the Fundamental
Mechanisms of Thought
by Douglas Hofstadter** and the
Fluid Analogies Research Group
ISBN 0-465-02475-0 (paperback)
copyright 1995, BasicBooks
One of the artificial intelligence challenges explored in the book was the
groups efforts at devising a non-dictionary method of solving "MEBLUJ" type
word puzzles. There are some correlations to your sentence splitting
challenge which may offer ideas for novel methods (OOP lingo pun intended).
Rather than use a brute force search of a dictionary against each possible
ordering for the full set of letters, the approach to "BUJMEL" solutions
taken by the Fluid Analogies group involved breaking the letters into pairs
and then scoring the pairs against a short list of 2 letter combinations. The
2 letter combinations list was (crudely) weighted to the likelyhood of an
ordered pair appearing in an english word, so "JU" would likely score higher
than "UJ" and "LE" would show as more likely than "ML". The letters were then
assembled in the order that produced best possible combination of pairing
scores.
When I read the book I was intrigued by the elegant simplicity of this
approach. I had always enjoyed word puzzles, but this got me in the habit of
solving the daily comics page Jumble in my head.
With the challenge of properly inserting spaces into colapsed sentences, one
approach might be to test 1, 2, 3 and 4 letter fragments against a short,
weighted list of common articles as the most likely word break points, eg. I,
A, AN, IS, IT, HE, OF, THE, FOR, SHE, THEN, THEY, THEM, THAN... etc. From
there, one could confirm the longer fragments and words with a relatively
small, secondary list of the most likely letter sequences.
Of course, if you want to obfuscate and complicate the problems, you can
always pursue a true neural net solution. ;-)
peace
aaron
**BTW, if you are unfamiliar with Hofstadter, he has done a lot of ground
breaking work in the computer specialty of A.I. and has published a number of
treatises on the topics. His most well known work, a Nobel prize winner and
a book that is perhaps the most entertaining and mentally stimulating
technical treatise that I have yet read, is:
"Escher, Godel, Bach: The Eternal Golden Braid"
I can highly recommend this book to all fans of graphic arts, mathematics,
music composition, computers or the sciences (in any weighted subset
combination of preference).
On Friday 21 October 2005 19:28, Dan Glauser wrote:
> Are there any computer scientists in the house?
> Mathematicians? Hard core geeks? I know you are out
> there.....
>
> I'm looking to discover, devise, or some how come up
> with an algorithm to find ideal full decomposition of
> a non delimited string into words. Some simple
> examples:
>
> ariverrunsthroughit => a river runs through it
>
> cookinguniversity => cooking university
>
> p1ecesofpie =? ?????? of pie => no match
>
> 8675rarara =? ???? ra ra ra => no match, must
> completely break down the string into words
>
> Some more interesting examples:
>
> booksugar => book sugar
> This is interesting because we don't want to make the
> mistake of:
>
> booksugar =? books ???? => no match
>
> This is a common issue with the algorithms I've been
> trying so far.
>
> goodstart =? good start || goods tart
> Which is it?
> Please note for the problem I am trying to solve it is
> more important to get the total number of words that
> fully fit the string so in this case it doesn't as
> much matter which strings are picked as long as the
> ones picked cover the entire string.
>
> Make sense?
>
> There are tons of more difficult examples that I've
> been feeding into different algorithms, I hesitate to
> cloud the AJUG list with them. If you like this type
> of challenge and/or are good with computer
> science/math then I would appreciate taking the
> discussion offline hearing your input. Once a
> solution is found I'd be happy to post the results to
> the list. This probably mirrors a classic CS problem
> used to solve xxxx, I'm just not seeing it and would
> like another set of eyes and someone to discuss
> different approaches with.
>
> If your help leads to a solution then I see a free
> dinner in your future......
>
> Come on, algorithms are fun!
>
> :)
>
> --
> Dan
> danglauser at yahoo.com
> _______________________________________________
> ajug-members mailing list
> ajug-members at ajug.org
> http://www.ajug.org/mailman/listinfo/ajug-members
>
>
More information about the ajug-members
mailing list