marnanel: (Default)
[personal profile] marnanel
Three things:
  1. Who is the audience?  There are the few but determined people who go to the Shavian wiki because they want to build a lexicon.  I know a lot about that audience.  But there are also people who want to read texts in Shavian or one of the other alphabets, or who want to transliterate text into Shavian or one of the other alphabets (even if only for a joke), or who want to learn about the alphabets.  I'm not sure who these people are or how many of them there are, how best to cater for them, or how best to attract the people who would like to know about the site but don't.  For example, would people be better served by having Shavian in one column and conventional spelling in the other, or would they prefer Shavian all the way through?  Would people like to be able to read Wikipedia in Shavian?  That could easily be done once the transliteration script has been separated from the wiki.
  2. How do we store disambiguation information?  At the moment we do this inline in the texts.  I would like to find a way of separating the disambiguation data from the text itself, so that we could keep pristine copies of (say) the contents of en.wikisource and just add disambiguation notes in a separate place to override automatic disambiguation (e.g. "read 'number' as 𐑯𐑳𐑥𐑼, not 𐑯𐑳𐑥𐑚𐑼").  (I'm talking about the back end here, not the user interface.)  We could of course use character or lexeme offsets, but then we would have to consider what happens when the text is updated, since any edit shifts every offset that follows it.  One possible option, which I tried with the "existing" system, is to store the position as the nth occurrence of a particular word.
  3. How can we be most efficient?  Caching makes everything faster, obviously, but it also gives us a huge memory footprint.  It would be possible to keep a local copy of the lexicon, refreshed every so often, to make lookups faster.  I'm also toying with the idea of storing each document as a series of word records, rather than as a single record, and doing the lookup on the database side using a left outer join.
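To make points 2 and 3 a bit more concrete, here is a minimal sketch of how they might fit together.  It is only an illustration under my own assumptions, not the existing system: the table names (doc_words, lexicon, overrides), the helper functions and the sample sentence are all hypothetical, and it uses an in-memory SQLite database purely for convenience.  Each document is stored as one row per word, each row records which occurrence of that word it is, and overrides are keyed on (word, nth occurrence) and kept apart from the text.

```python
import sqlite3
from collections import Counter

# The two spellings quoted in point 2 above.
DEFAULT_NUMBER = "𐑯𐑳𐑥𐑚𐑼"   # what automatic lookup gives for "number"
OVERRIDE_NUMBER = "𐑯𐑳𐑥𐑼"    # what the disambiguation note asks for

SCHEMA = """
    CREATE TABLE doc_words (doc_id INTEGER, position INTEGER,
                            word TEXT, occurrence INTEGER);
    CREATE TABLE lexicon   (word TEXT PRIMARY KEY, shavian TEXT);
    CREATE TABLE overrides (doc_id INTEGER, word TEXT,
                            occurrence INTEGER, shavian TEXT);
"""

def load_document(conn, doc_id, text):
    """Store a text as one row per word, numbering each word's occurrences."""
    seen = Counter()
    rows = []
    for position, word in enumerate(text.lower().split()):
        seen[word] += 1  # 1 for the first "number", 2 for the second, ...
        rows.append((doc_id, position, word, seen[word]))
    conn.executemany("INSERT INTO doc_words VALUES (?, ?, ?, ?)", rows)

def transliterate(conn, doc_id):
    """Look every word up on the database side with left outer joins.

    An override wins if one exists, then the lexicon entry; a word found
    in neither comes back NULL from both joins and keeps its spelling.
    """
    query = """
        SELECT COALESCE(o.shavian, l.shavian, w.word)
        FROM doc_words AS w
        LEFT OUTER JOIN lexicon AS l
               ON l.word = w.word
        LEFT OUTER JOIN overrides AS o
               ON o.doc_id = w.doc_id
              AND o.word = w.word
              AND o.occurrence = w.occurrence
        WHERE w.doc_id = ?
        ORDER BY w.position
    """
    return " ".join(row[0] for row in conn.execute(query, (doc_id,)))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO lexicon VALUES (?, ?)", ("number", DEFAULT_NUMBER))
load_document(conn, 1, "My hand grew number as the number rose")
# The note lives apart from the text: "the 1st 'number' in document 1".
conn.execute("INSERT INTO overrides VALUES (?, ?, ?, ?)",
             (1, "number", 1, OVERRIDE_NUMBER))

result = transliterate(conn, 1)
print(result)
```

In this sketch the first "number" picks up the override and the second falls back to the lexicon, while the words missing from the tiny lexicon are left in conventional spelling.  Keying on the nth occurrence means a note survives edits elsewhere in the text, though it would still point at the wrong word if someone added or removed an earlier "number".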
Your thoughts are, as ever, welcomed.
