Small epiphany about the Shavian website
Oct. 4th, 2010 11:15 am
The main purpose of the Shavian wiki is to build a lexicon. But when I started the site, it also allowed you to upload documents in the Latin alphabet, and it would transliterate them on the fly. It got to be quite clever, allowing you to add unknown words to the lexicon inline, and disambiguate homonyms.
However, the transliteration was done with a MediaWiki extension (called George), and therefore was written in PHP. It needed to build a fairly large cache of transliterations in memory, and because PHP runs in-process in the webserver, this resulted in the process taking up far too much memory. So I turned the transliteration off.
The sources of these documents used wiki markup. About the same time, I started building a set of typesetting tools which used DocBook markup. This was much more flexible. But then there were two sets of incompatible source documents, and the tools which were available to the wiki documents were not available to the DocBook documents.
It occurred to me a while ago that an equivalent CGI script would be just as good as the MediaWiki extension, and would return the memory once it was done. It occurred to me today that I should take this opportunity to stop using wiki markup and use DocBook for everything. I could easily write a translator for the documents which already exist. And writing the word-adding and disambiguating tools as CGIs would mean a lot more flexibility.
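To make the CGI idea a bit more concrete, here is a minimal Python sketch of what the core of such a script might look like. It is not the George extension's logic, just an illustration under some assumptions: the lexicon is a plain dictionary mapping a Latin-alphabet word to a list of candidate spellings, the entries in `LEXICON` are placeholders rather than real Shavian text, and the names `transliterate` and `cgi_response` are made up for this example. Unknown words are passed through unchanged and reported, and words with more than one candidate are flagged for disambiguation.

```python
import re

# Hypothetical lexicon: Latin-alphabet word -> candidate transliterations.
# The values are placeholders standing in for Shavian spellings.
LEXICON = {
    "the": ["<the>"],
    "cat": ["<cat>"],
    "read": ["<read:present>", "<read:past>"],  # two pronunciations
}

WORD = re.compile(r"[A-Za-z']+")


def transliterate(text, lexicon):
    """Return (output, unknown_words, ambiguous_words) for the given text."""
    unknown = []
    ambiguous = []

    def replace(match):
        word = match.group(0)
        key = word.lower()
        candidates = lexicon.get(key)
        if not candidates:
            # Not in the lexicon: leave it as-is and remember it,
            # so the user can be offered a chance to add it.
            if key not in unknown:
                unknown.append(key)
            return word
        if len(candidates) > 1 and key not in ambiguous:
            # More than one candidate: remember it for disambiguation.
            ambiguous.append(key)
        # Use the first candidate until the user picks another.
        return candidates[0]

    return WORD.sub(replace, text), unknown, ambiguous


def cgi_response(text, lexicon=LEXICON):
    """Build a plain-text CGI response: header, blank line, then the body."""
    body, unknown, ambiguous = transliterate(text, lexicon)
    lines = [body]
    if unknown:
        lines.append("Unknown: " + ", ".join(unknown))
    if ambiguous:
        lines.append("Ambiguous: " + ", ".join(ambiguous))
    return "Content-Type: text/plain; charset=utf-8\r\n\r\n" + "\n".join(lines)
```

A real script would read the document from the request, write the result of `cgi_response` to standard output, and exit. Because each request runs in its own short-lived process, whatever memory the lexicon cache takes up goes back to the system when the process ends, which is the whole point of moving it out of the webserver.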
I really like this idea. (I won't be doing it just yet, because I'm busy, but it's certainly something worth thinking about.)