Shawalphabet YahooGroup Archive Browser
From: Ethan <ethanl@...>
Date: 2006-05-02 19:26:04 #
Subject: Re: [shawalphabet] Re: Text conversion and homonyms
Star Raven wrote:
>Or instead of just picking out curly brackets by eyeball, why not let
>the program pick them out last, sort of a hold until it's done with the
>easy part.
>
>--Star in stormy Mid-TN
>
>
>
Most text editors have a word search function that can easily locate any
word or character - this would avoid the need to do an eyeball search.
Yet having the program do it would be a good idea as well, as long as it
is capable of displaying the context, such as a page of text at a time.
This may be trivial with some programs, but one thing I thought of is a
command-line program which can be used with various front-ends such as a
web interface or GUI, for simple drag-n-drop operations, or by people
who like to use the command line for various reasons, like Unix weenies
such as myself!
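
A minimal sketch of the "let the program find the curly brackets and show the context" idea, assuming the {choice1 choice2} format discussed in this thread. The function name find_alternates and the context width are hypothetical, chosen only for illustration.

```python
import re

# Matches one flagged group such as "{rId red}" (format taken from this thread).
ALTERNATES = re.compile(r"\{([^{}]*)\}")


def find_alternates(text, context=20):
    """Return (position, choices, snippet) for every {...} group in text.

    The snippet includes up to `context` characters on each side so a
    reader can judge which spelling the sentence calls for.
    """
    hits = []
    for match in ALTERNATES.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        hits.append((match.start(), match.group(1).split(), text[start:end]))
    return hits


for position, choices, snippet in find_alternates("F {rId red} HAt bUk."):
    print(position, choices, snippet)
```

Because it only takes a string and returns a list, the same function could sit behind a command-line tool, a web page, or a GUI.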
Now what's this about an idle duck?
--
Ethan
>--- Ethan <ethanl@...> wrote:
>
>>Thanks, Paul. Now, since I didn't catch a bunch of messages from that
>>time last year, did anybody actually do anything with the suggestions
>>that were made? Do we have such a thing as a Roman to Shavian
>>translation program? I've been pondering the possibility of making such
>>a thing for a while now, even though I'm not much of a programmer
>>myself. I kind of like the idea of having a program which can operate
>>with a designated substitution list, or will make a list by prompting
>>the user if none has been supplied ahead of time.
>>
>>Embedding the alternates in the output text, using curly brackets,
>>sounds like a good idea to me. It requires no user intervention during
>>the conversion process, thus makes it possible to run lots of text
>>through the convertor without interruption. Then the only thing
>>necessary is to open the text with a text editor and do a search for
>>curly brackets, and change each word to the one the context demands.
>>
>>I'm not quite sure I follow you regarding the adding of C, Q, and X to
>>Shavian. Could you explain that a bit further?
>>
>>--
>>Ethan
>>
>>paul vandenbrink wrote:
>>
>>>Hi Ethan
>>>Better late than never.
>>>I think that there are relatively few English words
>>>that have two equally valid pronunciations inside a particular
>>>accent group.
>>>If we specify to the transliteration program which accent group to
>>>convert the written T.O. English into, there will only be a small
>>>number of exceptions. If there is an exception, we can embed both
>>>Shavian spellings into the text with curly brackets around them.
>>>For example, the word perfect becomes {pxf-ekt pD-fekt}
>>>
>>>I don't think this is much of a hardship, and it shouldn't prevent
>>>automatic translation.
>>>
>>>There is another issue. Some words (e.g. names) and
>>>abbreviations are not common enough to be registered in
>>>the Database. By the way, there are a lot of duplicate abbreviations
>>>that represent different things. These other words that are not
>>>successfully translated should be marked with an asterisk and, if
>>>possible, retain their Roman spelling.
>>>
>>>How many additional letters would have to be added to Shavian
>>>to allow a Roman equivalent pronunciation?
>>>Offhand, I would think you would only need a way to represent
>>>C, Q and X in Shavian. All the other Roman letters have Shavian
>>>equivalents.
>>>That would be better than having to fiddle with 2 fonts.
>>>
>>>Regards, Paul V.
>>>__________________________attached_______________________________
>>>--- In shawalphabet@yahoogroups.com, Ethan <ethanl@...> wrote:
>>>
>>>>>>Star Raven wrote:
>>>>>>
>>>>>>>what about the homonyms? PERfect vs. perFECT etc.?
>>>>>>>
>>>>>>It should be PERF-ect (Adjective) vs per-FECT (Verb)
>>>>>>which not only has a different syllable boundary but also a
>>>>>>distinct Shaw Spelling.
>>>>>>
>>>>>> pxf-ekt ----- pD-fekt
>>>>>>
>>>>>>There are going to be homonyms in Shavian, perhaps even more
>>>>>>than T.O. In Shavian, won = one
>>>>>>
>>>>>>Overall, it still will be miles ahead of T.O.
>>>>>>
>>>>>This thread started out by suggesting that large amounts
>>>>>of literature could be transcribed by using a computer
>>>>>program and a look-up table. I believe Star's question
>>>>>was how to handle these homonyms in a look-up table.
>>>>>("one" vs. "won" is not a problem; only those which
>>>>>can have a different stress.)
>>>>>
>>>>>As someone else pointed out, natural language processing
>>>>>in English is not trivial. I would suggest that the look-up
>>>>>table simply have a flag on those words with more than one
>>>>>Shavian transcription. The program which does the
>>>>>transcription could flag those words in the output. Then a
>>>>>human could select the correct spelling. Thus 95% (or more)
>>>>>of the text would be automatic, with a minimal amount of
>>>>>human intervention.
>>>>>
>>>>>--Ph. D.
>>>>>
>>>>My thoughts exactly. The flag can be as simple as having more than one
>>>>solution to a word in the lookup table. Portions of the lookup table
>>>>might look like this:
>>>>
>>>>Douglas = duglas, Gduglas
>>>>douglas = duglas
>>>>dour = dQD
>>>>douse = dQs, dQz
>>>>dove = duv, dOv
>>>>dovecote = duvkOt
>>>>dovekie = duvkI
>>>>
>>>>reactive = rIAktiv
>>>>reactively = rIAktivlI
>>>>reactiveness = rIAktivnas
>>>>reactivity = rIAktivitI
>>>>reactor = rIAktD
>>>>read = rId, red
>>>>
>>>>When the text conversion program comes across the word "reactor", it
>>>>outputs "rIAktD" and goes on to the next word. But when it finds the
>>>>word "read", it sees two choices, which can be nearly impossible to
>>>>ascertain from context even for people at times! The simplest solution
>>>>might be to pop up a dialog asking the user to choose the output: "rId"
>>>>or "red".
>>>>
>>>>Example:
>>>>"I read that book." Could be "F rId HAt bUk." or "F red HAt bUk."
>>>>
>
>=========
>http://www.livejournal.com/users/wodentoad
>
>An idle duck is the devil's playground.
From: "kirk desimus" <kfs111@...>
Date: 2006-05-02 20:27:17 #
Subject: 1-1-2-3-5-8-13-21
subJekt: 1-1-2-3-5-8-13-21
ov wot signifikAns yr HIz numberz?
From: Josh Goke <jocago_atl@...>
Date: 2006-05-02 21:30:37 #
Subject: Re: [shawalphabet] 1-1-2-3-5-8-13-21
Ad Ic numbx tM H prIvWs wun.
kirk desimus <kfs111@...> wrote: subJekt: 1-1-2-3-5-8-13-21
ov wot signifikAns yr HIz numberz?
From: Star Raven <celestraof12worlds@...>
Date: 2006-05-02 22:37:27 #
Subject: Re: [shawalphabet] Re: Text conversion and homonyms
Thanks, Ethan,
Just trying to make it easier for my lazy rear.
--Star
And it's just what my sig line says:
=========
http://www.livejournal.com/users/wodentoad
An idle duck is the devil's playground.
From: Star Raven <celestraof12worlds@...>
Date: 2006-05-02 22:38:12 #
Subject: Re: [shawalphabet] 1-1-2-3-5-8-13-21
Fibonacci... got a harder one?
--Star (who loves math)
--- kirk desimus <kfs111@...> wrote:
> subJekt: 1-1-2-3-5-8-13-21
>
> ov wot signifikAns yr HIz numberz?
>
>
>
>
=========
http://www.livejournal.com/users/wodentoad
An idle duck is the devil's playground.
From: "paul vandenbrink" <pvandenbrink11@...>
Date: 2006-05-02 23:51:36 #
Subject: Re: Text conversion and homonyms
Hi Ethan
Nice to see you are getting back into the swing of things.
Unfortunately, I can't answer all your questions in one post.
I will try to reply to the remaining ones later.
First, is there such a thing as a Roman to Shavian
translation/transliteration program?
I think we are just developing the design at this point.
It would be a two-step process, as we discussed in my last post.
The first step is to translate all of the known words through
a program. The program would also mark all untranslated words
with a leading and trailing asterisk, and all words with
multiple Shavian interpretations with the different Shavian words
enclosed in curly braces. This would be the rough translation.
Any abbreviations, misspellings or new words would be flagged.
The rough translation would then be run through another, interactive
program, where an educated, literate English speaker would pick the
appropriate Shavian word from the list and expand the abbreviations
into normal words or an acceptable abbreviation from the Shavian
Database.
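
The first step might be sketched roughly as follows. This is only an illustration under stated assumptions: the table is a tiny excerpt of the lookup list quoted earlier in the thread, and the names LOOKUP and rough_pass are hypothetical, not part of any existing program.

```python
import re

# Tiny excerpt of the lookup table quoted in this thread; a real table
# would be loaded from a file and be far larger.
LOOKUP = {
    "reactor": ["rIAktD"],
    "read": ["rId", "red"],
    "dove": ["duv", "dOv"],
}

# A "word" here is just a run of letters and apostrophes (an assumption).
WORD = re.compile(r"[A-Za-z']+")


def rough_pass(text, table=LOOKUP):
    """Convert known words, brace ambiguous ones, star unknown ones."""

    def convert(match):
        word = match.group(0)
        # Try the exact spelling first (the table may be case-sensitive,
        # as with "Douglas" vs "douglas"), then fall back to lower case.
        choices = table.get(word) or table.get(word.lower())
        if not choices:
            return "*" + word + "*"              # untranslated: keep Roman
        if len(choices) == 1:
            return choices[0]                    # unambiguous
        return "{" + " ".join(choices) + "}"     # ambiguous: list them all

    return WORD.sub(convert, text)


print(rough_pass("read reactor xyzzy."))
```

Punctuation and spacing pass through untouched, so the output can be fed straight to the interactive second step.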
The second question I am answering is why I was asking
how many additional letters would have to be added to Shavian
to allow a Roman equivalent pronunciation.
This question refers back to our, so far, imaginary Roman to Shavian
translation/transliteration program, in particular the first part
that does the rough translation. If the program gets a misspelling
or some new English word or abbreviation for which it cannot
automatically figure out an equivalent Shavian translation, it should
pass the word through untouched, in Shavian letters. That would be
better than having to fiddle with 2 fonts.
The Roman Letter equivalents in Shavian are as follows:
Roman Shavian
a => Age
b => Bib
c => ?
d => Dead
e => Eat
f => Fee
g => Gag
h => Ha-ha
i => Ice
j => Judge
k => Kick
l => Loll
m => Mime
n => Nun
o => Oak
p => Peep
q => ?
r => Roar
s => So
t => Tot
u => Yew
v => Vow
w => Woe
x => ?
y => Yea
z => Zoo
> Offhand, I would think we would only need to find a way to
> represent the C, Q and X Roman Letters in Shavian.
> All the other Roman Letters have Shavian
> Equivalents. See above.
> Then we wouldn't have to fiddle with 2 different Fonts.
Regards, Paul V.
P.S. I suppose we could use the # (Pound), $ (Dollar) and % (Percent)
signs to represent Q, C, and X respectively.
They are not used much in normal English.
Any other suggestions? Concerning the way the rough transliteration
should look that is.
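
As a rough sketch of how the letter-for-letter pass-through could work, the table above can be written as a mapping from Roman letters to Shavian letter names, with the #, $ and % stand-ins from the P.S. filling the three gaps. The names LETTER_NAMES and spell_out are hypothetical, and the letter names merely stand in for the actual Shavian glyphs.

```python
# Roman letter -> Shavian letter name, copied from the table above.
# "$", "#" and "%" are the stand-ins proposed in the P.S. for C, Q and X.
LETTER_NAMES = {
    "a": "Age", "b": "Bib", "c": "$", "d": "Dead", "e": "Eat",
    "f": "Fee", "g": "Gag", "h": "Ha-ha", "i": "Ice", "j": "Judge",
    "k": "Kick", "l": "Loll", "m": "Mime", "n": "Nun", "o": "Oak",
    "p": "Peep", "q": "#", "r": "Roar", "s": "So", "t": "Tot",
    "u": "Yew", "v": "Vow", "w": "Woe", "x": "%", "y": "Yea",
    "z": "Zoo",
}


def spell_out(word):
    """Map each Roman letter of an untranslated word to its stand-in.

    Characters with no entry (digits, punctuation) are passed through.
    """
    return [LETTER_NAMES.get(ch, ch) for ch in word.lower()]


print(spell_out("fax"))
```

Keeping the three stand-ins in the same table means they can be swapped for other symbols later without touching the function.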
_______________attached________________________
--- In shawalphabet@yahoogroups.com, Ethan <ethanl@...> wrote:
> Now, since I didn't catch a bunch of messages from that
> time last year, did anybody actually do anything with the suggestions
> that were made?
> Do we have such a thing as a Roman to Shavian
> translation program? I've been pondering the possibility of making such
> a thing for a while now, even though I'm not much of a programmer
> myself. I kind of like the idea of having a program which can operate
> with a designated substitution list, or will make a list by prompting
> the user if none has been supplied ahead of time.
>
> Embedding the alternates in the output text, using curly brackets,
> sounds like a good idea to me. It requires no user intervention during
> the conversion process, thus makes it possible to run lots of text
> through the convertor without interruption. Then the only thing
> necessary is to open the text with a text editor and do a search for
> curly brackets, and change each word to the one the context demands.
>
> I'm not quite sure I follow you regarding the adding of C, Q, and X to
> Shavian. Could you explain that a bit further?
From: RSRICHMOND@...
Date: 2006-05-03 00:23:14 #
Subject: Re: [shawalphabet] 1-1-2-3-5-8-13-21
The fabled Fibonacci series. Each number in the series is the sum of the two
previous numbers.
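
For anyone who wants to check the rule against the subject line, a small sketch (the function name fibonacci is just illustrative, and the series is started at 1, 1 to match the numbers posted):

```python
def fibonacci(count):
    """Return the first `count` terms, each the sum of the previous two."""
    terms = []
    a, b = 1, 1                 # the series in the subject line starts 1, 1
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


print("-".join(str(n) for n in fibonacci(8)))  # prints 1-1-2-3-5-8-13-21
```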
Bob Richmond
From: "paul vandenbrink" <pvandenbrink11@...>
Date: 2006-05-03 01:37:51 #
Subject: Re: Text conversion and homonyms
Thanks Star
That's more or less what we are going to do:
break the process down into two steps.
First we run the Roman letters through a rough sieve and
get a rough transliteration.
Then we take that rough cut and run it through
an interactive process that
requires a little human intervention.
It's kind of the way Ma Bell deals with you when you phone 411
for Information nowadays.
It seems to be getting better.
Their program remembers and gets better with time.
It's an efficient way of dealing with a huge amount of data,
which is what we are talking about here.
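
A minimal sketch of that second, interactive step, assuming the rough cut marks ambiguous words as {choice1 choice2}. The names resolve and ask_on_console are hypothetical; the chooser is passed in as a function so that a console prompt, a GUI dialog, or a program that remembers earlier answers could all be plugged in.

```python
import re

# One flagged group such as "{rId red}" in the rough transliteration.
ALTERNATES = re.compile(r"\{([^{}]*)\}")


def resolve(text, choose):
    """Replace every {...} group with the option returned by choose().

    choose(options, before) receives the list of candidate spellings and
    the text preceding the group, so it can show some context.
    """

    def pick(match):
        options = match.group(1).split()
        return choose(options, text[:match.start()])

    return ALTERNATES.sub(pick, text)


def ask_on_console(options, before):
    """One possible chooser: show context and prompt on the terminal."""
    print("..." + before[-40:])
    for number, option in enumerate(options, 1):
        print("  %d. %s" % (number, option))
    return options[int(input("choice: ")) - 1]


# Non-interactive example: always take the second spelling.
print(resolve("F {rId red} HAt bUk.", lambda options, before: options[1]))
```

Calling resolve(rough_text, ask_on_console) would give the human-in-the-loop behaviour described above.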
Regards, Paul V.
_______________attached________________________
--- In shawalphabet@yahoogroups.com, Star Raven
<celestraof12worlds@...> wrote:
>
> Or instead of just picking out curly brackets by eyeball, why not let
> the program pick them out last, sort of a hold until it's done with the
> easy part.
>
> --Star in stormy Mid-TN
From: "Philip Newton" <philip.newton@...>
Date: 2006-05-03 05:54:09 #
Subject: Re: [shawalphabet] Re: Text conversion and homonyms
On 5/2/06, jocago_atl <jocago_atl@...> wrote:
> I have been considering this very topic for a while now. The only
> thing holding me back is my overall lack of time. My thought was to
> take an existing phonetic breakdown of English and use that to
> transliterate between latinized English and Shavian. The dataset I was
> considering is from here:
> http://www.speech.cs.cmu.edu/cgi-bin/cmudict. Has anyone looked at
> this approach before?
I'd considered doing something like that before as well, but never got
around to doing much.
One "problem" that I had was the lack of what I considered a suitable
pronouncing dictionary; both the CMU dictionary you mention and the
Moby Pronunciator ( http://www.dcs.shef.ac.uk/research/ilash/Moby/ ,
or search for "moby pronunciator" to get several alternative pages
offering it) use some American variety, which doesn't make all the
distinctions I make when writing Shavian.
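
For what it's worth, reading a CMU-style dictionary into a lookup table might be sketched like this. It assumes the plain-text layout in which each line is a word followed by its phonemes, alternate pronunciations carry a suffix such as (1), and comment lines begin with ";;;". The sample lines are illustrative only, not copied from the dictionary file, and parse_cmudict is a hypothetical name. Mapping the phoneme symbols to Shavian letters is a separate step not attempted here.

```python
import re
from collections import defaultdict

# Word, optional "(n)" marker for an alternate pronunciation, phonemes.
ENTRY = re.compile(r"^(\S+?)(?:\(\d+\))?\s+(.+)$")


def parse_cmudict(lines):
    """Return {word: [phoneme list, ...]} from CMU-style dictionary lines."""
    entries = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):   # skip blanks and comments
            continue
        match = ENTRY.match(line)
        if match:
            entries[match.group(1).lower()].append(match.group(2).split())
    return dict(entries)


# Illustrative sample in the assumed format.
SAMPLE = [
    ";;; sample entries",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
    "REACTOR  R IY0 AE1 K T ER0",
]

table = parse_cmudict(SAMPLE)
ambiguous = sorted(word for word, prons in table.items() if len(prons) > 1)
print(ambiguous)
```

Words that end up with more than one pronunciation are exactly the ones that would need flagging for a human to resolve.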
Cheers,
--
Philip Newton <philip.newton@gmail.com>
From: RSRICHMOND@...
Date: 2006-05-03 10:23:25 #
Subject: Re: Text conversion and homonyms
Philip Newton notes:
>>One "problem" that I had was the lack of what I considered a suitable
pronouncing dictionary; both the CMU dictionary you mention and the
Moby Pronunciator ( http://www.dcs.shef.ac.uk/research/ilash/Moby/ ,
or search for "moby pronunciator" to get several alternative pages
offering it) use some American variety, which doesn't make all the
distinctions I make when writing Shavian.<<
I use the Oxford English Dictionary for British Received Pronunciation, which
is the standard ("his late Majesty George V") for Androcles. I have the paper
edition, but it's available on CD-ROM. Though normally I write Shaw Alphabet
in my native Central-Western North American idiolect.
Bob Richmond