A simple "pet" project I'm looking at getting underway is a system-tray-accessible Lorem Ipsum generator that populates the clipboard with a user-definable number of words or paragraphs of Lorem Ipsum text simply by double-clicking the system tray icon. Similar to the excellent http://www.lipsum.com/ but as a desktop app. I've scoured Google looking for an algorithm to generate Lorem Ipsum text. My assumption is that it uses some kind of clever grammar code to string together the correct long and short words to make the sentences look "good" whilst remaining random. However, maybe it's dumber than that and picks from a long list of hard-coded sentences? I'd prefer to know how to approach this problem the first way rather than the "pick from pre-entered sentences" way, which sounds like a poor man's solution. Anyone got any ideas on how to approach this, or can you point me to somewhere that explains the generation technique in plain English? Thanks, CF
Charlie Farley Thursday, June 30, 2005
It isn't generated, so far as I'm aware, but a slab of text that printers have been using for years without number so you just paste a copy into whatever space you need to show layout.
"It uses a dictionary of over 200 Latin words, combined with a handful of model sentence structures, to generate Lorem Ipsum which looks reasonable. The generated Lorem Ipsum is therefore always free from repetition, injected humour, or non-characteristic words etc. "
If you happen to have followed the grammar generator example in the Dive Into Python book (diveintopython.org), you'll find a way to do it yourself. This gave me an idea though. The principle of the engine is simple: define an XML file that contains the grammar information, then use this XML file to generate text from that grammar. More info here: http://diveintopython.org/xml_processing/index.html#kgp.divein
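For anyone who wants to see the XML-grammar idea in miniature, here is a rough Python sketch. It is only loosely modeled on the Dive Into Python example: the element names (`ref`, `p`, `xref`), the tiny word list, and the function names are illustrative assumptions, not the book's actual code.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical grammar file contents: each <ref> names a rule, each <p> is one
# alternative for that rule, and <xref id="..."/> expands another rule in place.
GRAMMAR_XML = """
<grammar>
  <ref id="sentence">
    <p><xref id="word"/> <xref id="word"/> <xref id="word"/>.</p>
    <p><xref id="word"/> <xref id="word"/>, <xref id="word"/> <xref id="word"/>.</p>
  </ref>
  <ref id="word">
    <p>lorem</p><p>ipsum</p><p>dolor</p><p>sit</p><p>amet</p>
  </ref>
</grammar>
"""


def load_grammar(xml_text):
    """Map each rule id to the list of its <p> alternatives."""
    root = ET.fromstring(xml_text)
    return {ref.get("id"): ref.findall("p") for ref in root.findall("ref")}


def expand(rules, rule_id, rng):
    """Pick one alternative at random and recursively expand its <xref> children."""
    choice = rng.choice(rules[rule_id])
    parts = [choice.text or ""]
    for child in choice:
        parts.append(expand(rules, child.get("id"), rng))
        parts.append(child.tail or "")  # literal text following the <xref/>
    return "".join(parts)


def generate_sentence(xml_text=GRAMMAR_XML, seed=None):
    """Generate one capitalized sentence from the 'sentence' rule."""
    rng = random.Random(seed)
    text = expand(load_grammar(xml_text), "sentence", rng)
    return text[:1].upper() + text[1:]


if __name__ == "__main__":
    print(generate_sentence())
```

Adding more sentence shapes or words only means editing the XML, which is the appeal of keeping the grammar outside the code.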
FWIW, don't use Lorem Ipsum as a test data generator, since it doesn't help in revealing any character encoding gotchas.
Is it the generation algorithm that you're interested in, or simply the application residing in the system tray that will copy some lipsum text to the clipboard upon request? If it's simply the application, then why generate the text at all? Just get 500 or so paragraphs from lipsum.com, and store them in a resource file for your application. Have the app spit out the first n words, sentences or paragraphs upon request. If more than 500 paragraphs are needed, repeat from the beginning again - I don't think repetition would be a problem after 500 or so. You could also randomise the order of the paragraphs if required.
redeye Thursday, June 30, 2005
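A minimal Python sketch of the stored-text approach, assuming the paragraphs have already been saved somehow. The three-entry `PARAGRAPHS` list is a stand-in for the 500 or so paragraphs suggested above, and the function names are made up for illustration.

```python
import itertools
import random

# Placeholder corpus: in the approach described above this would be roughly
# 500 paragraphs loaded from a resource file.
PARAGRAPHS = [
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
    "Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
    "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.",
]


def first_paragraphs(n, corpus=PARAGRAPHS, shuffle=False, seed=None):
    """Return n paragraphs, starting over from the beginning if the corpus runs out."""
    source = list(corpus)
    if shuffle:
        random.Random(seed).shuffle(source)  # optional randomised order
    return list(itertools.islice(itertools.cycle(source), n))


def first_words(n, corpus=PARAGRAPHS):
    """Return the first n words as one string, wrapping around the corpus if needed."""
    words = itertools.chain.from_iterable(
        paragraph.split() for paragraph in itertools.cycle(corpus)
    )
    return " ".join(itertools.islice(words, n))


if __name__ == "__main__":
    print(first_words(12))
    print("\n\n".join(first_paragraphs(4, shuffle=True)))
```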
Joel had a string of text he used that had a bunch of foreign characters in it, but that has nothing to do with the question at hand. What's wrong with: possible random adjective, random noun, possible random adverb, random verb, possible random adjective, random noun, period? How much more grammar do you need?
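That template is short enough to sketch directly in Python. The word lists and the 50% chance for each "possible" slot are assumptions picked for illustration; swap in any vocabulary you like.

```python
import random

# Illustrative word lists (assumptions, not from the thread).
ADJECTIVES = ["quick", "lazy", "bright", "hollow"]
NOUNS = ["river", "engine", "garden", "signal"]
ADVERBS = ["quietly", "rarely", "boldly"]
VERBS = ["follows", "carries", "watches", "builds"]


def maybe(words, rng, chance=0.5):
    """Return a one-word list with the given probability, else an empty list."""
    return [rng.choice(words)] if rng.random() < chance else []


def random_sentence(rng=None):
    """Build: [adjective] noun [adverb] verb [adjective] noun, then a period."""
    if rng is None:
        rng = random.Random()
    parts = (
        maybe(ADJECTIVES, rng) + [rng.choice(NOUNS)]
        + maybe(ADVERBS, rng) + [rng.choice(VERBS)]
        + maybe(ADJECTIVES, rng) + [rng.choice(NOUNS)]
    )
    sentence = " ".join(parts)
    return sentence[0].upper() + sentence[1:] + "."


if __name__ == "__main__":
    for _ in range(3):
        print(random_sentence())
```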
Thanks for all your input. I spoke to the webmaster at www.lipsum.com, who hinted that the text is generated from a dictionary of words and model sentence structures, and I believe it is based on syllables, so the sentence "Lorem ipsum dolor sit amet" could be represented as being made of words of 2, 2, 2, 1 and 2 syllables. In theory, we could similarly write "Magna dolor amet a wisi", and because the words have no semantic meaning the shape of the sentence is maintained. By having a collection of syllable maps for sentences of varying length and a large dictionary of words grouped by number of syllables, I was able to produce pretty good-looking and very random Lorem Ipsum text in a couple of hours this afternoon using *cough* classic ASP. Just me (Sir to you), I appreciate the advice not to use it as a test data generator. It's more to do with aesthetics, as I don't like to present applications to users for their testing with my "test", "ajkhkjdshsl" and "qwerty" entries! Plus, they all think I'm fluent in Latin when I show them my "Lorem ipsum..." test data ;-) redeye, I was more interested in adding another string to my coding toolbox by figuring out how it was generated. Had it got into tricky grammar rules I'd have given up and gone for picking bits from the hard-coded, pre-stored block of Lorem ipsum, but as I've discovered, the results from even simple syllable rules seem pretty good. However, I do really want the functionality of a system tray application to make inserting lorem ipsum into my apps as simple as possible. MarkTAW, I agree I could have copied a block of text from anywhere and used sentences from that. I'll have to put my hand up and admit that I'm a geek and I think Lorem Ipsum just looks cool. 1.3 million visitors to www.lipsum.com this year indicates I'm not alone! Thanks again for all your input.
Charlie Farley Thursday, June 30, 2005
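The syllable-map idea described above could look something like this in Python (rather than classic ASP). This is a guess at the shape of the technique, not the lipsum.com code or the poster's script: the word groups, the hand-assigned syllable counts, and the three maps are all assumptions.

```python
import random

# Hypothetical dictionary grouped by (hand-assigned, approximate) syllable count.
WORDS_BY_SYLLABLES = {
    1: ["sit", "a", "et", "ut", "in", "sed", "ad"],
    2: ["lorem", "ipsum", "dolor", "amet", "magna", "wisi"],
    3: ["aliquam", "consequat", "labore", "dolore"],
}

# Each map is the "shape" of one sentence as syllables per word;
# (2, 2, 2, 1, 2) is the shape of "Lorem ipsum dolor sit amet".
SYLLABLE_MAPS = [
    (2, 2, 2, 1, 2),
    (1, 2, 3, 2),
    (2, 1, 3, 2, 1, 2),
]


def fill_map(shape, rng):
    """Pick one random word of the required syllable count for each slot."""
    return [rng.choice(WORDS_BY_SYLLABLES[count]) for count in shape]


def generate_sentence(rng=None):
    """Choose a random syllable map and fill it with words."""
    if rng is None:
        rng = random.Random()
    words = fill_map(rng.choice(SYLLABLE_MAPS), rng)
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


def generate_paragraph(sentence_count, rng=None):
    """Join several generated sentences into one paragraph."""
    if rng is None:
        rng = random.Random()
    return " ".join(generate_sentence(rng) for _ in range(sentence_count))


if __name__ == "__main__":
    print(generate_paragraph(4))
```

More maps and a bigger dictionary give more variety; the code itself never needs to know any grammar.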
Personally I don't really prefer Lorem Ipsum, at least not when starting with those very words. They are meant to not convey any information and be neutral, but I've seen them so many times that they kind of have a special meaning to me. I much prefer random Indonesian text, mostly because it looks so funny. It also makes no sense at all to me, even less than Lorem Ipsum text. It can easily be found by searching for co.id on Google. How it can look: Omar mengunjungi korban kecelakaan di Rumah Sakit Siaga Raya Pasar Minggu bersama Menteri Negara BUMN Sugiharto. Dalam kunjungan Omar dan Sugiharto sempat berdialog dengan para korban dan keluarganya. Though, it's just a matter of taste to me.
Sebastian Thursday, June 30, 2005
MarkTAW, I'm not clever enough to conjugate and can only just about identify English nouns and verbs! Sebastian, the Indonesian looks cool but knowing my luck I'd end up insulting the Indonesian population when my randomly generated text translates to something rude!
Charlie Farley Thursday, June 30, 2005
MAXIMUM INQUEMENTUM TUM BIGUTTAM EGRESSO SCRIBE. MEO MAXIMO VESTIBULO PERLEGAMENTUM DA. DA DUO TUM MAXIMUM CONSCRIBEMENTA MEIS LISTIS. DUM LISTIS DECAPITAMENTUM DAMENTUM NEXTO FAC SIC NEXTUM TUM NOVUMVERSUM SCRIBE EGRESSO. LISTA SIC HOC RECIDEMENTUM NEXTUM CIS VANNEMENTA DA LISTIS. CIS. This is not shouting, this is Latin. This is the sieve of Eratosthenes in Lingua::Romana::Perligata. Perligata is in my opinion one of the most beautiful esoteric programming languages ever invented. Take any Perl-script, and have it translated to Perligata. There you have your Lorem Ipsum. Should not be too difficult to obtain large amounts of Perl-code as a source for your translator, e.g. from CPAN. Added benefit, Perl is so dense/ has so little redundancy/ is so unintelligible/ is so brittle, that almost any random ascii-text can be interpreted as valid Perl. Thus you can generate perfect geeky Latin with a random-generator and the Perligata-translator. SALUTEM
You mean stuff like: The dopey journalist educated a xenophobic travel agent. The fat virtuous ostrich purlioned the noxious weed. The thrifty bandicoot looked at the noxious weed. The nocturnal fruitbat tripped over the shrubbery. An avaricious unicorn contemplated the shrubbery. The quick emu swallowed the watery soup. A microbe looked at a juniper bush. A well-preserved virtuous hippo ate a kiwi fruit. An emu uselessly conversed with the threadbare hearthrug. The well-preserved little necromancer educated a doormat. A quokka praised an alien artifact. The xenophobic ostrich visited an orange. The rapid rapid fruitbat broke a shrubbery. The ingenious aardvark organized the book. The well-preserved nocturnal programmer purlioned the valuable manuscript.... etc etc. I wrote something that generates random text like this as part of a demo program for simple crypto techniques in http://www.ubercode.com - what it does is have arrays of subjects, verbs, adjectives and objects and randomly pick from them to construct sentences. For the purposes of the crypto routines (which were *very* simple) I wanted words starting with each letter of the alphabet. The sentences are syntactically correct but semantically they are nonsense - I was going to improve it by 'typing' the subjects / objects but haven't got round to that yet. If you're interested you're welcome to try it out (it's in the 'crypto.cls' example).
