Shavian eGroup Archive Browser
From: Newton, Philip
Date: 2001-10-10 13:54:05 #
Subject: Re: [shavian] Phoneme Frequencies
Mitchell Perilstein wrote:
> Can anyone suggest a useful, modern corpus?
The Calgary text-compression corpus, or whatever its successor is called?
> - The Gutenberg archive and other free texts have old
> writings in them (wherefore art thou...).
But not all of Gutenberg is that old-fashioned, I think.
> plain text or HTML
Well, if you're looking for a phoneme distribution, it really needs to be
marked up in phonemes and not traditional orthography. Automatic translation
by reference to a dictionary can't tell you whether "read" rhymes with
"reed" or with "red", so you don't know whether that's a count for (Shavian)
/I/ or (Shavian) /e/. Or "invalid" being /invAlid/ (not valid) or /invalid/
(handicapped). So getting a standard corpus only solves half the problem if
it's in traditional orthography.
Getting a phoneme count on _Androcles_ would be a start, though. That's
already marked up in phonemes.
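
A minimal sketch of such a count, assuming the text is stored in the one-ASCII-character-per-Shavian-letter transliteration used in this archive (for example /andrOklIz). The function name and the rule "every alphabetic character is one phoneme" are assumptions made for illustration, not an established tool.

```python
from collections import Counter

def count_phonemes(text: str) -> Counter:
    """Count letters in a text that uses one ASCII character per Shavian letter.

    Assumption: every alphabetic character stands for exactly one phoneme and
    case matters ("I" and "i" are different letters). The "/" naming dot,
    spaces, digits and punctuation are skipped.
    """
    return Counter(ch for ch in text if ch.isalpha())

sample = "/andrOklIz"
for letter, n in count_phonemes(sample).most_common():
    print(letter, n)
```

Because the text is already phonemic, no dictionary lookup is needed, so the "read"/"invalid" ambiguity never arises.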
Cheers,
Philip
--
Philip Newton <Philip.Newton@...>
All opinions are my own, not my employer's.
If you're not part of the solution, you're part of the precipitate.
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
From: Robert McBroom
Date: 2001-10-11 01:11:16 #
Subject: Re: [shavian] Phoneme Frequencies
Philip is right. There's no value in collecting statistics from any document that is not written in Shavian.
But I'm afraid the beloved /andrOklIz hardly satisfies as a useful text either. Like every play, its most common words are the characters' names, which lead off every paragraph. At the least, they would have to be eliminated to produce any data remotely useful.
I have Shavianed the first twenty pages of the gargantuan Preface to /andrOklIz, and even run some character counts on it, but it is so fraught with the peculiar words with which Shaw builds his argument (Christianity, redemption, salvation, atonement, Gospels etc.) that it hardly qualifies as a "general text" for statistical purposes.
Add to that that it's my middle American accent that did the transliterating, and not good King George's, and what have you got of use?
/not muc.
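
The name-stripping step mentioned above could be sketched roughly as follows. The label format is a guess: it assumes each speech starts with a "/"-marked name followed by a period, which may not match how the play is actually laid out, and the function name and sample lines are made up for illustration.

```python
import re

# Hypothetical label format: the line opens with a "/"-marked name followed
# by a period, e.g. "/andrOklIz. ...". Adjust the pattern to the real layout.
SPEAKER_LABEL = re.compile(r"^/[A-Za-z]+\.\s*")

def strip_speaker_labels(lines):
    """Return the lines with any leading speaker label removed."""
    return [SPEAKER_LABEL.sub("", line) for line in lines]

speeches = ["/andrOklIz. not muc.", "not muc."]
print(strip_speaker_labels(speeches))
```

Names that occur inside a speech are left alone, since only a label at the very start of a line is removed.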
--
- /bob /mk/brMm
/wUdstak /nV /jDk
"wun simpol iz az gUd Az anuHD prOvFdid
evriwun atAcez H sEm mIniN tM it."
- /gPJ /bxnRd /SY
From: Mitchell Perilstein
Date: 2001-10-11 06:58:21 #
Subject: Re: [shavian] Phoneme Frequencies
On Wed, 10 Oct 2001 19:15:19 -0400
Robert McBroom <info@...> wrote:
| Philip is right. There's no value in collecting statistics from any
| document that is not written in Shavian.
Good idea, you guys, to look at Shavian'ed docs. Fortunately, there ARE some Shavian'ed docs
in various places on the net aside from Androcles. When I get a chance I will write some code
to scan them and report what I find.
Another thought is to identify all the homographs in the phonemic dictionary and attempt to correct
for them, either by discarding them or by attempting to determine which was meant by the context;
a large undertaking. But then the dictionary could be used against any text.
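
As a rough sketch of the first step (finding the homographs), assuming the phonemic dictionary can be read as (spelling, pronunciation) pairs. The function name and the sample entries are illustrative only; the entries use the "read" and "invalid" examples from earlier in this thread.

```python
from collections import defaultdict

def find_homographs(entries):
    """Map each spelling that has more than one pronunciation to those pronunciations.

    `entries` is an iterable of (spelling, pronunciation) pairs; this layout
    is an assumption about how the dictionary is stored.
    """
    pronunciations = defaultdict(set)
    for spelling, pron in entries:
        pronunciations[spelling.lower()].add(pron)
    return {word: sorted(prons)
            for word, prons in pronunciations.items()
            if len(prons) > 1}

entries = [
    ("read", "rId"), ("read", "red"),
    ("invalid", "invAlid"), ("invalid", "invalid"),
    ("reed", "rId"),
]
print(find_homographs(entries))
```

The resulting list could then be either dropped from the counts or passed to whatever context-based disambiguation is attempted later.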
Best,
--
Mitchell Perilstein
mitch@...
www.enetis.net/~mitch
+1 (605) 574-2367
From: Ewout Stam
Date: 2001-10-14 14:57:10 #
Subject: Re: [shavian] Phoneme Frequencies
-----Original message-----
From: Mitchell Perilstein <mitch@...>
To: shavian@... <shavian@...>
Date: Tuesday 9 October 2001 20:11
Subject: Re: [shavian] Phoneme Frequencies
>On Tue, 09 Oct 2001 08:55:23 -0600
>Lee Hickenlooper <leehickenlooper@...> wrote:
>
> | I would like to see results of a frequency count compiled from phonemes
> | as they are really used. Also, I'd like to see a frequency count of
> | phonemes, diphones, triphones, word combinations and phrases compiled.
>
>Agreed. The dict answered how often they occur in the dict, not really how
>often they occur in writing. It would be easy to extend the dict program
>to do this.
>
>Can anyone suggest a useful, modern corpus? It seems the problem is
>how to find one that embodies how *most* people write:
> - The Gutenberg archive and other free texts have old writings in them
>   (wherefore art thou...).
> - Usenet postings have lots of modern writing, but lots of non-native
>   speakers' writings and lots of those who chose not to (War3Z, D00dZ)
>   and spam (FREE SEX!!!)
> - We could grab the news wires, but who writes like journalists? Maybe if
>   we skip headlines.
>
I once did something like this for letter frequency in Dutch texts. I took a
bunch of texts from a typing course program and a few help files and had
computer programs count the number of letters. I remembered the most
frequent letters, because they formed a pronounceable 'word': Enatirodls.
Typing course texts cover a variety of subjects and writing
styles, so you might want to take a look at something like that.
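
A letter count of this kind can be sketched in a few lines. This is not the original program, only an illustration; the function name and the choice to fold case are assumptions.

```python
from collections import Counter

def top_letters(text: str, n: int = 10) -> str:
    """Return the n most frequent letters of `text`, most frequent first.

    Letters are lower-cased before counting; anything that is not a letter
    is ignored. Ties keep the order in which the letters first appear.
    """
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    return "".join(letter for letter, _ in counts.most_common(n))

print(top_letters("aaa bb c", 2))
```

Run over a large enough body of Dutch text with n=10, a count like this is what would produce a ten-letter string such as the one quoted above.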
Ewout
From: Levi der Eroberer
Date: 2001-10-15 23:32:52 #
Subject: [shavian] Re: Shaw Script
--- In shavian@y..., "Robert S. Reeser" <reeser@c...> wrote:
> Does anyone have a copy of Andy Callaway's ShawScript TrueType font? It is
> no longer available at his site, and he tells me he is having difficulty
> finding it on his system.
>
> If anyone would be willing to e-mail me a copy, I'd greatly appreciate it.
Actually, it's within 4 clicks of the mouse.
Click "Files" on the left, then navigate through "Fonts" and "Windows." Click on
Shawscrp.ttf. The file should start downloading. If not, try right-clicking on the name
of the file.
From: Levi der Eroberer
Date: 2001-10-15 23:48:28 #
Subject: [shavian] Re: Help! I'm an American!
--- In shavian@y..., Robert McBroom <info@o...> wrote:
> Your posting reminds me that - in my few months as a member - I have
> noticed that we all have our particular "blind spots"
>
> I too am rhotic American, and never had an issue with ah and awe.
> For me, when the doctor sticks the tongue depressor in your mouth he
> says "sE a!". And when the crow makes noise he "kYz".
>
> My own personal blind spot is x w and D which certainly sound
> identical to my ears. Can you - /jAkI t /jAkF - help me with this
> distinction?
The difference between err and array is stress. Err is stressed and array isn't. In
British speech, they are pronounced differently, but for Americans the stress rule is
sufficient.
However, in my dialect, up and ado are separate phonemes. Ado is always
unstressed, but up also occurs in unstressed positions (like "unstressed" and
"peanut", which aren't anstrest and pInat, but unstrest and pInut).
Ah, on and awe are one and the same in my dialect, but I have learned to
differentiate ah/on and awe in speech (mainly by flipping thru dictionary
pronunciation guides and figuring out the rules) and all three in Shavian text.
If you know one of the three will be used, remember these rules to select which one
it is:
On: Use as the short "o" and in "ough"-combinations. (not, fought, cough)
Ah: Usually spelt "a" as in 'father' and 'yacht' (garage, llama)
Awe: Always use for 'au' and 'aw' combinations, 'al' or 'all' combinations, and the
words 'broad' and 'Broadway' (Australia, raw, altar, ball)
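
Read as a spelling heuristic, the three rules above could be sketched like this. It is only an illustration of the rules as stated for one dialect; the function name and the order in which the rules are tried are assumptions, and it presumes the word is already known to contain one of the three vowels.

```python
def guess_vowel(word: str) -> str:
    """Guess "on", "ah" or "awe" from traditional spelling.

    Rough encoding of the three rules above; the rule order (awe, then on,
    then ah) is an assumption made for this sketch.
    """
    w = word.lower()
    # Awe: 'au', 'aw', 'al'/'all' spellings, plus 'broad' and 'Broadway'.
    if w in {"broad", "broadway"} or any(s in w for s in ("au", "aw", "al")):
        return "awe"
    # On: short "o" and "ough" spellings.
    if "o" in w:
        return "on"
    # Ah: usually spelt "a".
    return "ah"

for word in ["not", "cough", "father", "llama", "Australia", "ball", "broad"]:
    print(word, guess_vowel(word))
```

A check on spelling alone will misfire on many words and cannot reflect a speaker's own accent, so the output is at best a first guess.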
BTW, how do you change the font to show Shavian characters instead of Roman
ones?
From: Philip Newton
Date: 2001-10-17 05:06:46 #
Subject: Re: [shavian] Re: Help! I'm an American!
On 15 Oct 01, at 22:47, Levi der Eroberer wrote:
> If you know one of the three will be used, remember these rules to select which one
> it is:
> On: Use as the short "o" and in "ough"-combinations. (not, fought, cough)
> Ah: Usually spelt "a" as in 'father' and 'yacht' (garage, llama)
> Awe: Always use for 'au' and 'aw' combinations, 'al' or 'all' combinations, and the
> words 'broad' and 'Broadway' (Australia, raw, altar, ball)
"ough" for me takes the "awe" vowel. "caught", "court", and "fought"
rhyme for me. (But "cough" rhymes with "off" and takes the "on" vowel.)
And I say "Ostraylia" and not "Awestraylia". (And my "yacht" rhymes
with "pot".)
Cheers,
Philip
--
Philip Newton <Philip.Newton@...>
From: Simon Barne
Date: 2001-10-17 16:23:00 #
Subject: [shavian] English too hard
"The complex syllable structure and spelling of English are the likely
reasons why children are slower at learning basic reading and writing
skills in English than in any other European language, according to new
research. The study by Professor Philip Seymour, from Dundee University,
of 700 primary school children in 15 European countries showed the
English speaking sample took two and a half years to master "the basic
foundation elements of literacy" compared with a year elsewhere."
From: http://education.guardian.co.uk
From: Ewout Stam
Date: 2001-10-20 10:22:08 #
Subject: Re: [shavian] Re: Help! I'm an American!
-----Original message-----
From: Philip Newton <philip.newton@...>
To: shavian@... <shavian@...>
Date: Wednesday 17 October 2001 6:07
Subject: Re: [shavian] Re: Help! I'm an American!
>On 15 Oct 01, at 22:47, Levi der Eroberer wrote:
>
>> If you know one of the three will be used, remember these rules to select which one
>> it is:
>> On: Use as the short "o" and in "ough"-combinations. (not, fought, cough)
>> Ah: Usually spelt "a" as in 'father' and 'yacht' (garage, llama)
>> Awe: Always use for 'au' and 'aw' combinations, 'al' or 'all' combinations, and the
>> words 'broad' and 'Broadway' (Australia, raw, altar, ball)
>
>"ough" for me takes the "awe" vowel. "caught", "court", and "fought"
>rhyme for me. (But "cough" rhymes with "off" and takes the "on" vowel.)
>
>And I say "Ostraylia" and not "Awestraylia". (And my "yacht" rhymes
>with "pot".)
>
>Cheers,
>Philip
>--
>Philip Newton <Philip.Newton@...>
I guess as long as people know what you're saying you're all right. I can't make a distinction between e and A in speech, but I do know when to use which one. My A sounds like e (just a little longer, and I try to mix it with the Dutch 'aa' (long a), a sound that doesn't occur in English).
I think the Dutch 'aa' is the IPA a.
My U turns into either a or M.
But I write U, even if I don't pronounce it that way.
From: Vir Strakul
Date: 2001-10-25 01:36:06 #
Subject: [shavian] Learning Strategies
I'm new to the Shavian alphabet and am trying to learn it. For those
who are more experienced with it, what strategy would you
recommend for learning it?
I am first trying to be able to read it and then I'll try to learn
how to write it. Is this a good strategy or should I first learn to
write it?
I know there are many ways to learn it but I want the easiest and
fastest way to do so. Any other advice will be greatly appreciated.
Thanks in advance,
Vir Strakul