[ajug-members] Geek Challenge

Joe Sam Shirah jshirah at attglobal.net
Fri Oct 21 18:19:35 EDT 2005


    Hi Dan,

    Due to a large, in-progress project, I don't have the time to jump to
code for your algorithms, but I do have a suggestion.  Let me tell you a
story that illustrates my reasoning.  I have a BBS with an emphasis in
Economics and a Master of International Management degree (actually a
pretty good background for business computing).  Along the way, I picked up
some (now rusty) Russian and German.

    When I first took the language classes, I worked diligently on speaking
slowly and clearly.  I was dismayed to get a low level B after untold hours
of effort.  Then it struck me that native speakers speak quickly and the
words almost run together.  The native listener has the capability to tell
when words begin and end.  I started speaking faster and deliberately
slurring my words.  The native instructors were thrilled, and I got A's from
then on.

    The point is, I would look at speech-to-text algorithms for the shortest
path to answers for your problem.  While those products are not perfect
(partially due to issues like context and emphasis, a la Seinfeld), I think
they face the same problems in a slightly different form, and I would expect
most of that work to be applicable.  HTH, and I hope you didn't mind my
personal illustration.  Best,


                                                         Joe Sam

Joe Sam Shirah -        http://www.conceptgo.com
conceptGO       -        Consulting/Development/Outsourcing
Java Filter Forum:       http://www.ibm.com/developerworks/java/
Just the JDBC FAQs: http://www.jguru.com/faq/JDBC
Going International?    http://www.jguru.com/faq/I18N
Que Java400?            http://www.jguru.com/faq/Java400


----- Original Message ----- 
From: "Dan Glauser" <danglauser at yahoo.com>
To: <ajug-members at ajug.org>
Sent: Friday, October 21, 2005 3:28 PM
Subject: [ajug-members] Geek Challenge


> Are there any computer scientists in the house?
> Mathematicians?  Hard core geeks?  I know you are out
> there.....
>
> I'm looking to discover, devise, or somehow come up
> with an algorithm to find the ideal full decomposition of
> a non-delimited string into words.  Some simple
> examples:
>
> ariverrunsthroughit   =>  a river runs through it
>
> cookinguniversity  =>  cooking university
>
> p1ecesofpie    =?  ?????? of pie =>  no match
>
> 8675rarara   =?  ???? ra ra ra  => no match, must
> completely break down the string into words
>
> Some more interesting examples:
>
> booksugar  =>  book sugar
> This is interesting because we don't want to make the
> mistake of:
>
> booksugar =?  books ????  =>  no match
>
> This is a common issue with the algorithms I've been
> trying so far.
>
> goodstart  =?  good start || goods tart
> Which is it?
> Please note that, for the problem I am trying to solve, it is
> more important to get the total number of words that
> fully fit the string, so in this case it doesn't matter
> as much which words are picked, as long as the
> ones picked cover the entire string.
>
> Make sense?
>
> There are tons of more difficult examples that I've
> been feeding into different algorithms, but I hesitate to
> cloud the AJUG list with them.  If you like this type
> of challenge and/or are good with computer
> science/math, then I would appreciate taking the
> discussion offline and hearing your input.  Once a
> solution is found I'd be happy to post the results to
> the list.  This probably mirrors a classic CS problem
> used to solve xxxx; I'm just not seeing it and would
> like another set of eyes and someone to discuss
> different approaches with.
>
> If your help leads to a solution then I see a free
> dinner in your future......
>
> Come on, algorithms are fun!
>
> :)
>
> --
> Dan
> danglauser at yahoo.com
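
The decomposition problem described in the quoted message can be sketched
with dynamic programming over string prefixes.  This is only an
illustration under stated assumptions, not something either poster wrote or
endorsed: the function name segment and the tiny word sets used with it are
made up for the example, and a real run would need a full dictionary.

```python
def segment(text, words):
    """Return one list of dictionary words that exactly covers text, or None.

    Hypothetical helper: 'words' is any collection of known words.
    """
    words = set(words)
    max_len = max(map(len, words), default=0)

    # back[i] holds the start index of the last word in some full split of
    # text[:i], or None when text[:i] cannot be split completely.
    back = [None] * (len(text) + 1)
    back[0] = 0  # the empty prefix is trivially covered

    for end in range(1, len(text) + 1):
        # Only look back as far as the longest dictionary word.
        for start in range(max(0, end - max_len), end):
            if back[start] is not None and text[start:end] in words:
                back[end] = start
                break

    if back[len(text)] is None:
        return None  # no full decomposition exists

    # Walk the back-pointers from the end of the string to recover the words.
    result = []
    end = len(text)
    while end > 0:
        start = back[end]
        result.append(text[start:end])
        end = start
    result.reverse()
    return result


if __name__ == "__main__":
    sample = {"book", "books", "sugar", "of", "pie", "pieces"}
    print(segment("booksugar", sample))    # ['book', 'sugar']
    print(segment("p1ecesofpie", sample))  # None
```

Because every prefix that can be fully split is remembered, the "books"
prefix of booksugar does not block the "book" + "sugar" split, which is
where a greedy longest-match scan goes wrong.  For an ambiguous input such
as goodstart the sketch returns whichever full cover it finds first, which
matches the stated requirement that any complete cover will do.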



