The Joel on Software Discussion Group (CLOSED)

A place to discuss Joel on Software. Now closed.

Google Suggest for a dictionary

Hi guys,

I just finished implementing Google suggest for a dictionary database.

http://www.objectgraph.com/dictionary

The code is clean, and you can see it by using "View Source".

The dictionary database is on an SQL server (total of 18000+ words) with an index on the word column.
Gavi Narra
Friday, December 24, 2004
 
 
beautiful :)
Li-fan Chen
Friday, December 24, 2004
 
 
Nicely done!
Joel Spolsky
Friday, December 24, 2004
 
 
A fantastic way to cheat at Scrabble if ever there were any!
Ryan Phelps
Friday, December 24, 2004
 
 
If you type in a string not in the dictionary, you get this error:

"There was a problem retrieving data:
undefined"
Buggletty Adams
Friday, December 24, 2004
 
 
Hey Gavi, how did you set up the backend? What kind of dictionary source is it, and how did you clean it up for db use? Can you show the web service aspx source feeding the div?
Li-fan Chen
Saturday, December 25, 2004
 
 
The dictionary he's using looks like Webster 1913 to me.
Robert Plank
Saturday, December 25, 2004
 
 
hmmmm.... interesting meaning for the word "rocket"
ace
Saturday, December 25, 2004
 
 
That is a legitimate definition, just not the first one you'd think of. It is, however, the first alphabetically.
www.MarkTAW.com
Saturday, December 25, 2004
 
 
I just added a "how does it work?" section. This should clear up some of the questions and yes, the dictionary is old.
Gavi Narra
Saturday, December 25, 2004
 
 
Really impressive. Get the site finished, get Scoble to link to the dictionary and 2005's looking pretty rosy for you. :)
Thom Lawrence
Saturday, December 25, 2004
 
 
Hey Gavi,

If it's okay with you: I started working on a PHP article a few days ago on how to clone Google Suggest, also saw the doc on Apple's Developer site, and decided to use your cool dictionary example.

Link: http://www.jumpx.com/tutorials/googlesuggest

I mentioned your name and linked to that demo. I can link to a main site of yours too if you can give me a URL.

As I said, I started this a few days ago and wanted to walk people through every step, from extracting the dictionary to using an IFRAME to finally using XMLHttpRequest. I was so impressed with the idea that I just had to show people how to do it.
Robert Plank
Saturday, December 25, 2004
 
 
Pretty cool.

Your 'how it works' says "but a connection is being opened for every keystroke". Are you sure? I think at least in IE, that it will recycle connections as appropriate. But I may be wrong.
mb
Saturday, December 25, 2004
 
 
> I just added a "how does it work?" section.

Gavi that's just awesome, keep up the excellent work!! :-)
Li-fan Chen
Saturday, December 25, 2004
 
 
A quick test with netstat looks good, though (2 connections are created and then maintained). But I didn't really monitor it (e.g. with netmon). That would explain how it can run so fast, too.
mb
Saturday, December 25, 2004
 
 
> Are you sure? I think at least in IE, that it will recycle connections as appropriate. But I may be wrong.

Hmm, I just messed around with it and it looks like it depends on your temporary internet file settings.  If you have it set to never check for new pages, the XML request will be cached the same as loading pages "the normal way."
Robert Plank
Saturday, December 25, 2004
 
 
The dictionary itself (from 1913) provides entertainment value all by itself. For example, check out the definition of "computer".
Oren
Saturday, December 25, 2004
 
 
Nice done! Can you implement cross browser support too?
Peter Monsson
Saturday, December 25, 2004
 
 
>Nice done! Can you implement cross browser support too?

Looks like it is supported in Mozilla 1.0+, IE 5.0+ and Safari 1.2.

I don't know about the other ones (like Konqueror and Opera).
Gavi Narra
Saturday, December 25, 2004
 
 
Even Google Suggest doesn't work in Opera. If you wanted to be psychotic about it, you could have it fall back to iframes if XMLHTTP/XMLHttpRequest etc. doesn't work.
Robert Plank
Saturday, December 25, 2004
 
 
Works well with Opera 7.6 preview.  Very nice job.
gogogadgetscott
Saturday, December 25, 2004
 
 
Funny. I've already started using Google Suggest to check spelling, and wondered how long before someone does a good dictionary that works like that. This rules.
Mystran
Sunday, December 26, 2004
 
 
Robert,

Have you tested your "Google Suggest" demo using Firefox 1.0? It works at my end using IE but not Firefox.
John Topley
Sunday, December 26, 2004
 
 
It's even working with Firebird 0.6, FWIW.
Pakter
Sunday, December 26, 2004
 
 
I've got it working with Firefox 1.0
Ankur
Sunday, December 26, 2004
 
 
I see that Gavi mentions:
"The response from the database is fast for the time being, but a connection is being opened for every keystroke and is not a good idea"

What good workarounds or alternatives do you guys suggest?
Anon
Sunday, December 26, 2004
 
 
Just a couple of thoughts...

You might assume that the bandwidth usage for a page designed this way would be much higher than a regular page, but I wouldn't be surprised if the bandwidth savings by only transmitting the part of the UI that changed almost make up for the extra hits.

The request could be triggered by a client-side timer set to fire, say, 500 ms after key-up, rather than by the key-up event itself, to save unnecessary round trips.
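
(Editor's sketch, not from the original post.) To make the timer idea concrete, here is a small Python simulation of that kind of debounce. The function name `debounced_requests`, the 500 ms default, and the sample typing session are all assumptions made up for illustration; a real page would do this in browser script with a timer that is reset on every key-up.

```python
def debounced_requests(keyups, delay_ms=500):
    """Return (fire_time_ms, text) for each request a debounce timer would send.

    keyups is a list of (time_ms, text_in_box) pairs in chronological order.
    Each key-up restarts the timer, so a request fires only when no further
    key-up arrives within delay_ms.
    """
    fired = []
    for i, (t, text) in enumerate(keyups):
        is_last = i == len(keyups) - 1
        # The timer survives only if the next key-up comes after it expires.
        if is_last or keyups[i + 1][0] - t >= delay_ms:
            fired.append((t + delay_ms, text))
    return fired


# Hypothetical typing session: "rock" typed quickly, a pause, then "et".
keyups = [(0, "r"), (120, "ro"), (250, "roc"), (400, "rock"),
          (1500, "rocke"), (1620, "rocket")]
requests = debounced_requests(keyups)
print(requests)  # [(900, 'rock'), (2120, 'rocket')]
```

In this made-up session six key-ups turn into two requests. The cost is that the first suggestion shows up 500 ms after the user pauses, so the delay is a trade-off between server load and responsiveness.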

As cool as this dynamic behavior is, did anyone stop to think about why the usability of this dictionary is better than it would have been if the user was required to hit Enter after typing the query text? I fail to see the big improvement in this particular application, but as a proof of concept it might help show the way toward more web-based applications where dynamic, partial updates are more relevant.

It is way too much work to implement this behavior in a browser compared to traditional fat clients, in my opinion. Frameworks and components need to encapsulate this behavior. Then this technology will really start to take off.

This only works on Windows, because it relies on an ActiveX component, isn't that right?

I noticed the query supports wildcard searches using the %-sign. This is a cool feature, but defeats the index. It also appears to me (from looking at the "how does it work" page) that there is a SQL injection vulnerability, although I don't have the desire or patience to try to exploit or prove it.
Big B
Sunday, December 26, 2004
 
 
If you look at the code, it first tries to use ActiveX but if that doesn't work defaults to XMLHttpRequest.  So it isn't a Windows-only thing.

Since the suggestions show up right away, you don't have to try spelling a word, hit Enter, hit Back, click back into the text box, retype, etc. You can hit backspace and try a bunch of words in a couple of seconds, versus 30 seconds to a minute doing it the other way with lots of repetition.

Neither Gavi's nor my code removes an extra % if there is one, but both our implementations protect against SQL injection attacks. Look at Gavi's code where he replaces ' with ''. As for the %, a bug is just an undocumented feature, right?  ;-)
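
(Editor's sketch, not from the original post.) An alternative to doubling quotes by hand is a parameterized query, with the LIKE wildcards escaped so that a user-typed % is matched literally. The example below uses Python's built-in sqlite3 module purely as a stand-in; the real site ran on SQL Server behind ASP.NET. The table name `words`, the helper names `escape_like` and `suggest`, and the sample rows are assumptions for illustration.

```python
import sqlite3


def escape_like(term, escape_char="\\"):
    """Escape LIKE wildcards so user-typed % and _ match literally."""
    return (term.replace(escape_char, escape_char * 2)
                .replace("%", escape_char + "%")
                .replace("_", escape_char + "_"))


def suggest(conn, prefix, limit=10):
    """Return up to `limit` words starting with `prefix`, alphabetically."""
    pattern = escape_like(prefix) + "%"  # only the trailing % is a wildcard
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ESCAPE '\\' "
        "ORDER BY word LIMIT ?",
        (pattern, limit),  # values are bound, never spliced into the SQL text
    )
    return [row[0] for row in rows]


# Hypothetical table and sample rows, standing in for the real dictionary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO words VALUES (?)",
                 [("rock",), ("rocket",), ("rocking",), ("rod",), ("it's",)])

print(suggest(conn, "rock"))  # ['rock', 'rocket', 'rocking']
print(suggest(conn, "%"))     # []
print(suggest(conn, "it'"))   # ["it's"]
```

Because the user's text is only ever passed as a bound parameter, a stray quote is just data, and a leading % no longer turns the lookup into a match-everything scan.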

I think the issue isn't so much bandwidth as it is resources... let's say you had 10 users on your site at any given time, all typing in a 10-letter word.  That's 100 separate queries, 100 separate connections, all at the same time.

I think the 500 ms delay thing would almost defeat the purpose of this though.

Sorry, I didn't know it worked in Opera 7.6. I'm still on 6.02, and when I type keys in I see something loading in the status bar but suggestions never show up.
Robert Plank
Sunday, December 26, 2004
 
 
As I mentioned, it's probably NOT multiple HTTP connections, just multiple HTTP queries over one pipelined connection, which are usually quite fast. Of course, database connection pooling is an issue in your server-side code.
mb
Sunday, December 26, 2004
 
 
Thank you all for your comments. Yes, I agree about the connection pooling issue. I still think that if you are doing this for only one thing, like, let's say, Google Suggest, it would be wise to write a custom web server and load all the data into memory as a searchable data structure (the whole dictionary in plain text, all 180000+ words, is only 5 MB). That would definitely help improve the speed and should be very scalable. But let's say you would like to create an AutoFill ASP.NET control that could be incorporated into web applications by drag and drop: then we would definitely have to consider a standard RDBMS as the backend, and connection pooling and indexes would play an important role.
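
(Editor's sketch, not from the original post.) One simple form the "searchable data structure in memory" could take is a sorted word list queried by binary search. The class name `PrefixIndex` and the sample words below are assumptions for illustration; the post does not say which structure the custom server would actually use.

```python
import bisect


class PrefixIndex:
    """Sorted in-memory word list supporting prefix lookups via binary search."""

    def __init__(self, words):
        # Lower-case and de-duplicate once, at load time.
        self._words = sorted(set(w.lower() for w in words))

    def suggest(self, prefix, limit=10):
        prefix = prefix.lower()
        if not prefix:
            return []
        # First position whose word is >= prefix; matches are contiguous from here.
        start = bisect.bisect_left(self._words, prefix)
        matches = []
        for word in self._words[start:start + limit]:
            if not word.startswith(prefix):
                break
            matches.append(word)
        return matches


index = PrefixIndex(["rocket", "rock", "computer", "compute", "rod", "Rocking"])
print(index.suggest("roc"))            # ['rock', 'rocket', 'rocking']
print(index.suggest("comp", limit=1))  # ['compute']
```

Each lookup is a binary search plus a short scan over at most `limit` entries, with no database round trip, which is the kind of per-keystroke cost the paragraph above is aiming for.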

Keep checking the website. I will load other free dictionaries, like a dictionary of computing and a dictionary of medicine, and will also implement the custom web server and ASP.NET component (and of course I will post all the source code).
Gavi Narra
Sunday, December 26, 2004
 
 
ASP.NET should manage much of your database connection pooling for you.

Aren't abstractions nice?
mb
Monday, December 27, 2004
 
 
Here's another implementation - a zip code lookup....

 http://69.51.83.2/finder
lurker
Monday, December 27, 2004
 
 

This topic is archived. No further replies will be accepted.
