Shawalphabet YahooGroup Archive Browser

From: "Yahya" <yahya@...>
Date: 2009-04-01 17:12:22 #
Subject: Re: A bialphabetic experiment

--- In shawalphabet@yahoogroups.com, "Thomas Thurman" <tthurman@...> wrote:
>
> I have created a wiki at http://shavian.marnanel.org/ .
>
> It uses the same software as Wikipedia, but with one custom extension. Pages whose name begins "Document:" are automatically transliterated on the fly into the Shavian alphabet, by looking up the words in the wiki itself. Examples:
>
> http://shavian.marnanel.org/index.php/Document:When_I_was_one_and_twenty
>
> http://shavian.marnanel.org/index.php/Document:Androcles_and_the_Lion/Act_I
>
> Red words which aren't in the lexicon can be added, and blue words which are can be altered, simply by clicking on them. Editing the page will reveal that it's still in the Latin alphabet underneath.
>
> I would be happy and honoured if a few people could come and build this up with me. Every word added to any document is automatically transliterated in every other, so the more public domain documents we copy from Wikisource and elsewhere the more the lexicon will learn and the larger a library we will have.
>
> I'm not publicising this other than here yet, since it's very experimental. I'm looking forward to seeing how far it can go.
>
> Thomas

Wonderful stuff, Marnanel! Wishing you success in this venture,
Yahya
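
The lookup scheme Thomas describes (each word is looked up in a shared lexicon, and words that are missing get flagged so someone can add them) might be sketched roughly as below. This is only an illustration, not the wiki extension's code: the `LEXICON` dictionary, the `transliterate` function and the two sample spellings are all assumptions made for the example.

```python
import re

# Hypothetical in-memory lexicon. The wiki keeps its entries as wiki pages;
# the spellings here are illustrative only.
LEXICON = {
    "when": "\U00010462\U00010467\U0001046F",  # woe, egg, nun
    "one": "\U00010462\U00010473\U0001046F",   # woe, up, nun
}

WORD = re.compile(r"[A-Za-z']+")

def transliterate(text, lexicon):
    """Replace known words with their lexicon entry; collect unknown words."""
    missing = []

    def replace(match):
        word = match.group(0)
        shavian = lexicon.get(word.lower())
        if shavian is None:
            missing.append(word)   # would show up as a "red word"
            return word            # leave the Latin spelling in place
        return shavian

    return WORD.sub(replace, text), missing

output, missing = transliterate("When I was one", LEXICON)
```

Because every document shares one lexicon, adding "was" once would fix it everywhere it occurs, which is the effect Thomas describes.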

From: dshep <dshepx@...>
Date: 2009-04-01 19:47:43 #
Subject: re: A bialphabetic experiment

Very impressive, Thomas

Would it be possible to make the Shaw text larger?

in awe,
dshep

From: Philip Newton <philip.newton@...>
Date: 2009-04-01 20:14:50 #
Subject: Re: [shawalphabet] A bialphabetic experiment

2009/4/1 Thomas Thurman <tthurman@...>:
> I have created a wiki at http://shavian.marnanel.org/ .
>
> It uses the same software as Wikipedia, but with one custom extension.  Pages whose name begins "Document:" are automatically transliterated on the fly into the Shavian alphabet, by looking up the words in the wiki itself.

Awesome idea. Thank you.

Cheers,
--
Philip Newton <philip.newton@...>

From: "ed_shapard" <ed_shapard@...>
Date: 2009-04-01 22:44:16 #
Subject: Re: A bialphabetic experiment

I've had some success with this using the CSS font-size-adjust property.

http://www.w3schools.com/CSS/pr_font_font-size-adjust.asp

--- In shawalphabet@yahoogroups.com, dshep <dshepx@...> wrote:
>
> Very impressive, Thomas
>
> Would it be possible to make the Shaw text larger?
>
> in awe,
> dshep
>

From: "Thomas Thurman" <tthurman@...>
Date: 2009-04-01 23:00:21 #
Subject: Re: A bialphabetic experiment

--- In shawalphabet@yahoogroups.com, dshep <dshepx@...> wrote:
>
> Very impressive, Thomas

Thank you!

> Would it be possible to make the Shaw text larger?

Do you want the Shaw text larger and not the Latin text, or is it just as good to make all the text larger together?

T

From: "ed_shapard" <ed_shapard@...>
Date: 2009-04-01 23:13:42 #
Subject: Re: State of play with vowels and CMUDict

Awesome work. You rock!

Oh, but strangely, "array" gets transliterated with 'Err' instead of 'Array'. I checked my version of CMU dict and it lists array as ER0 EY1 = "hurt" (no stress) + "ate" (primary stress).

that should transliterate to Array + Age. 𐑩𐑱

I got the same results with "manner"

--- In shawalphabet@yahoogroups.com, "Thomas Thurman" <tthurman@...> wrote:
>
> I have changed my transliterator to differentiate Up and Ado:
>
> http://marnanel.org/shavian/transliterate?l=up+ado
>
> This also works for Err and Array:
>
> http://marnanel.org/shavian/transliterate?l=the+array+errs
>
> This change has not worked through to the public version of the Perl module yet, but that's coming soon.
>
> I have not yet added homonym support. This is on the cards.
>
> The vowel mapping is now:
>
> IH = if
> EH = egg
> AE = ash
> AH = ado (in syllables without primary stress)
> AO = on
> UH = wool
> AW = out
> AA = ah
> IY = eat
> EY = age
> AY = ice
> AH = up (in syllables with primary stress)
> OW = oak
> UY = ooze
> OY = oil
> Nothing = awe
>
> Ah and Awe are not differentiated by CMUDict because of the father/bother merger:
>
> http://marnanel.org/shavian/transliterate?l=Ah+I+am+in+awe
>
> Currently I make them both Ah. A solution to this is suggested in my next post.
>
> Thomas
>
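
The stress-dependent part of this mapping (AH becomes Up or Ado, and ER becomes Err or Array, depending on whether the syllable carries primary stress) could be written as a small rule like the one below. It is a sketch, not Thomas's Perl module: the function name is made up, and it assumes CMUdict vowel symbols end in a stress digit (0, 1 or 2).

```python
# Vowels whose Shavian letter depends on stress:
# (letter with primary stress, letter otherwise).
STRESS_SENSITIVE = {
    "AH": ("up", "ado"),
    "ER": ("err", "array"),
}

def stress_sensitive_letter(symbol):
    """Map e.g. 'AH1' or 'ER0' to a Shavian letter name.

    Returns None for symbols that are not stress-sensitive.
    """
    if len(symbol) < 2 or symbol[-1] not in "012":
        return None
    base, stress = symbol[:-1], symbol[-1]
    if base not in STRESS_SENSITIVE:
        return None
    stressed, unstressed = STRESS_SENSITIVE[base]
    return stressed if stress == "1" else unstressed
```

Under this rule the ER0 in "array" (ER0 EY1) comes out as Array, not Err, which is the result expected above.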

From: "ed_shapard" <ed_shapard@...>
Date: 2009-04-02 01:09:01 #
Subject: Re: State of play with vowels and CMUDict

Thomas,

I went through your vowel mappings for CMU dict, and some of them differ from what I came up with, namely AO, AA, and UW.

My CMU dict version is 0.3 dated 9-7-94

AO
listed as "ought". Merriam-Webster uses the same symbol for this vowel as for "caught" and "awe", so I transliterate it as AWE, not ON.

AA
listed as "odd". Merriam-Webster uses the same symbol for the vowel in "odd" as for the vowel in "on", so I transliterate it as ON, not AH.

----- So if I'm correct about those last two, CMUdict differentiates between "cot" and "caught", i.e. no cot-caught merger.

UY
I don't have a UY symbol, but I do have a UW, maybe this is a typo, or we have different versions...

---- I believe that the "father-bother" merger is a merge of ON with AH. Merriam-Webster shows the same vowel sound for both "father" and "bother", so it isn't any help. But dictionary.reference.com shows different IPA symbols for each word.
AA = "odd" (in CMU). dictionary.reference.com shows "odd" as having the same vowel as "bother" and "on", but not "father".
For some reason, I'm under the impression that the "ON" phoneme is much more common than the "AH" phoneme, so I'd transliterate AA as ON, instead of AH, and then work on finding AH words and updating them.

So if I'm right, then the only merger we have to worry about with CMUdict is the "father-bother" merger, and the bother/ON sound is the more common. So what we really need is a list of AH words.
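
One way to start on that list would be to pull every word whose pronunciation contains AA out of the dictionary and review those by hand. The sketch below assumes the dictionary is a text file of `WORD  PHONES` lines with `;;;` comment lines. The sample lines only illustrate that layout and are not quoted from any particular CMUdict release, and the function name is made up for the example.

```python
def words_with_phoneme(lines, base):
    """Yield words whose pronunciation contains `base` at any stress level."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue                       # skip blank and comment lines
        word, *phones = line.split()
        # Drop the trailing stress digit before comparing, so AA0/AA1/AA2 all match.
        if any(p.rstrip("012") == base for p in phones):
            yield word

SAMPLE = [
    ";;; illustrative lines in the assumed 'WORD  PHONES' layout",
    "FATHER  F AA1 DH ER0",
    "BOTHER  B AA1 DH ER0",
    "OUGHT  AO1 T",
    "ODD  AA1 D",
]

candidates = list(words_with_phoneme(SAMPLE, "AA"))
```

The output is only a candidate list: the dictionary itself cannot say which AA words should take AH and which ON, so each one still has to be checked against another source.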


From: "ed_shapard" <ed_shapard@...>
Date: 2009-04-02 01:30:52 #
Subject: Re: State of play with vowels and CMUDict

Here's the list of CMUdict symbol mappings that I came up with.
Apart from most of the ligatures (are, or, air, ear, ian, and yew), the only Shavian character not mapped is AH.

AA0 on
AA1 on
AA2 on
AE0 ash
AE1 ash
AE2 ash
AH0 ado
AH1 up
AH2 ado
AO0 awe
AO1 awe
AO2 awe
AW0 out
AW1 out
AW2 out
AY0 ice
AY1 ice
AY2 ice
B bib
CH church
D dead
DH they
EH0 egg
EH1 egg
EH2 egg
ER0 array
ER1 err
ER2 array
EY0 age
EY1 age
EY2 age
F fee
G gag
HH ha-ha
IH0 if
IH1 if
IH2 if
IY0 eat
IY1 eat
IY2 eat
JH judge
K kick
L loll
M mime
N nun
NG hung
OW0 oak
OW1 oak
OW2 oak
OY0 oil
OY1 oil
OY2 oil
P peep
R roar
S so
SH sure
T tot
TH thigh
UH0 wool
UH1 wool
UH2 wool
UW0 ooze
UW1 ooze
UW2 ooze
V vow
W woe
Y yea
Z zoo
ZH measure
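
A list in this `SYMBOL name` form can be loaded directly and used as a lookup table. The sketch below does that for a handful of rows copied from the list above; the full list would be pasted in the same way. It relies on the letter names in the list matching the Unicode character names (`SHAVIAN LETTER ...`), which holds for the names used here. The function names are made up for the example.

```python
import unicodedata

# A few rows copied from the list above, in the same "SYMBOL name" format.
TABLE_TEXT = """
AE1 ash
AH0 ado
AH1 up
ER0 array
ER1 err
EY1 age
M mime
N nun
P peep
"""

def load_table(text):
    """Parse 'SYMBOL name' lines into a dict, skipping blank lines."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            symbol, name = line.split()
            table[symbol] = name
    return table

def to_shavian(pronunciation, table):
    """Convert a space-separated CMUdict pronunciation to Shavian characters."""
    chars = []
    for symbol in pronunciation.split():
        name = table[symbol]   # raises KeyError for symbols missing from the table
        chars.append(unicodedata.lookup("SHAVIAN LETTER " + name.upper()))
    return "".join(chars)

TABLE = load_table(TABLE_TEXT)
array_word = to_shavian("ER0 EY1", TABLE)       # array + age
manner_word = to_shavian("M AE1 N ER0", TABLE)  # mime + ash + nun + array
```

Keying the table on the full symbol, stress digit included, is what lets ER0 and ER1 (or AH0 and AH1) map to different letters without any special-case code.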


From: dshep <dshepx@...>
Date: 2009-04-02 04:01:01 #
Subject: re: a bialphabetic experiment

--- In shawalphabet@yahoogroups.com, Thomas wrote:
>
> Do you want the Shaw text larger and not the Latin text, or is it just as good to make all the text larger together?

Just the Shaw text, which in my browser (Firefox) appears rather small.
Everything else is fine.

dshep

From: "Thomas Thurman" <tthurman@...>
Date: 2009-04-02 17:58:53 #
Subject: Re: A bialphabetic experiment

In my lunchbreak today I added something that's clearly been needed: homonym support. You can mark a word as a homonym, and it will be flagged in the text so you can tag it to disambiguate it properly.

http://shavian.marnanel.org/index.php/Shavian:Dab has more details.

T
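
How the wiki actually marks and tags homonyms is described on the linked page, not in this message. Purely as an illustration of the idea (flag a homonym until it carries a tag, then use the tag to pick a spelling), here is a sketch in which the `word/tag` notation, the `HOMONYMS` table and the `resolve` function are all hypothetical.

```python
# Hypothetical homonym table: each tag selects one spelling.
HOMONYMS = {
    "read": {
        "present": "\U0001046E\U00010470\U0001045B",  # roar, eat, dead
        "past": "\U0001046E\U00010467\U0001045B",     # roar, egg, dead
    },
}

def resolve(token, homonyms):
    """Return (text, needs_tag) for one token written as 'word' or 'word/tag'."""
    word, _, tag = token.partition("/")
    variants = homonyms.get(word.lower())
    if variants is None:
        return token, False          # not a homonym; left for the normal lookup
    if tag in variants:
        return variants[tag], False  # disambiguated by its tag
    return word, True                # homonym without a usable tag: flag it
```

For example, `resolve("read", HOMONYMS)` flags the word for tagging, while `resolve("read/past", HOMONYMS)` returns the past-tense spelling.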