marnanel | Shavian and disambiguation

I mentioned earlier an idea I had for automatic part-of-speech disambiguation based only on the part of speech of the preceding word. I also mentioned that I believe this would be a workable way to disambiguate the pronunciation of most homographs (words which are spelled alike but pronounced differently).

I would therefore like to create a distributable database which maps conventional spellings of English words either to (part of speech, phonemic representation) pairs, or (in the case of ambiguous spellings) to a mapping from sets of parts of speech to such pairs. For an ambiguous spelling, the part of speech of the previous word would be used to choose which pair applies.
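To make the shape of that database concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the tag names, the IPA strings standing in for the phonemic representation, the sample entries, and the `lookup` helper itself, including its fallback of taking the first listed reading when the previous word's part of speech matches nothing.

```python
from typing import Dict, FrozenSet, Optional, Tuple, Union

# (part of speech, phonemic representation)
Entry = Tuple[str, str]
# For ambiguous spellings: set of previous-word parts of speech -> Entry
Ambiguous = Dict[FrozenSet[str], Entry]

# Hypothetical sample data; tag names and transcriptions are placeholders.
DATABASE: Dict[str, Union[Entry, Ambiguous]] = {
    "cat": ("noun", "kæt"),
    "record": {
        frozenset({"determiner", "adjective"}): ("noun", "ˈrɛkɔːd"),
        frozenset({"pronoun", "to", "modal"}): ("verb", "rɪˈkɔːd"),
    },
}


def lookup(word: str, previous_pos: Optional[str]) -> Optional[Entry]:
    """Return the (part of speech, pronunciation) pair chosen for `word`."""
    entry = DATABASE.get(word.lower())
    if entry is None:
        return None  # unknown spelling
    if isinstance(entry, tuple):
        return entry  # unambiguous spelling: previous word is irrelevant
    for pos_set, candidate in entry.items():
        if previous_pos in pos_set:
            return candidate
    # Assumed fallback: no set matched (e.g. start of sentence),
    # so take the first listed reading.
    return next(iter(entry.values()))
```

With this layout, `lookup("record", "determiner")` picks the noun reading and `lookup("record", "modal")` picks the verb reading, while `lookup("cat", ...)` ignores the previous word entirely.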

Sources of data would be:

the Shavian wiki, where possible (licence is cc-by)
cmudict where the Shavian wiki wasn't possible (licence is BSD-like)
the Brown tagger for the parts of speech (licence is MIT)
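The preference order between the two pronunciation sources could be sketched like this. It assumes each source has already been loaded into a plain dictionary from spelling to pronunciation; the function name, the source labels, and the sample entries are all made up for illustration. Keeping a note of where each entry came from seems useful because the two sources have different licences.

```python
from typing import Dict, Tuple


def merge_sources(
    shavian_wiki: Dict[str, str], cmudict: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Map each spelling to (pronunciation, source label).

    cmudict entries go in first; Shavian wiki entries then overwrite
    them, so the wiki wins wherever it has the word.
    """
    merged = {word: (pron, "cmudict") for word, pron in cmudict.items()}
    merged.update(
        {word: (pron, "shavian-wiki") for word, pron in shavian_wiki.items()}
    )
    return merged


# Hypothetical sample data, not taken from either real source.
wiki_sample = {"cat": "kæt"}
cmudict_sample = {"cat": "K AE1 T", "dog": "D AO1 G"}
merged_sample = merge_sources(wiki_sample, cmudict_sample)
```

In `merged_sample`, "cat" comes from the wiki and "dog" falls back to cmudict.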

So, two things I need to consider:

what this database would be called
how to evaluate it.

I think one way to evaluate it might be to take a corpus which is already POS-tagged, and evaluate it by:

1. assuming all words are nouns
2. assuming all words which the Shavian wiki believes are ambiguous are nouns, and using the Brown tagger for the rest
3. using the POS-of-the-previous-word method outlined above
4. using the Brown tagger

and checking that (3) is closer to (4) than (2) is. Other ideas are welcome, of course.
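One possible way to pin down "closer" is the fraction of tokens on which two taggings agree. The sketch below assumes each method's output has been flattened into a list of tags, one per token, over the same corpus; the function names and the toy tag lists are invented for illustration, and other distance measures would work just as well.

```python
from typing import Sequence


def agreement(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of tokens on which two tag sequences agree."""
    if len(predicted) != len(reference):
        raise ValueError("tag sequences must cover the same tokens")
    if not reference:
        return 0.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def previous_word_method_is_closer(
    method2: Sequence[str], method3: Sequence[str], method4: Sequence[str]
) -> bool:
    """True if method (3) agrees with method (4) more often than (2) does."""
    return agreement(method3, method4) > agreement(method2, method4)


# Toy tag sequences for a four-token text (made up, not real tagger output).
toy_method2 = ["noun", "noun", "verb", "noun"]
toy_method3 = ["det", "noun", "verb", "noun"]
toy_method4 = ["det", "noun", "verb", "adj"]
toy_result = previous_word_method_is_closer(toy_method2, toy_method3, toy_method4)
```

Since the corpus is already POS-tagged, the same `agreement` function could also score each method against the corpus's own tags, which would show how far (4) itself is from the hand-checked answer.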