Shawalphabet YahooGroup Archive Browser
From: "Thomas Thurman" <tthurman@...>
Date: 2009-03-26 19:11:36
Subject: Corpora and alignment
I am working on a tool which will take a file containing text in the Latin alphabet and another containing the same text in the Shavian alphabet, the second text having been produced by a human rather than a machine transliterator, and output a list of all words used mapping Latin to Shavian. It will have a sanity check so that if too many words in a row don't have consonants that match to some extent it will stop with a warning. There will also be a way to check which Shavian spellings have multiple Latin spellings associated with them and which Latin spellings have multiple Shavian spellings.
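A minimal sketch of how such an aligner might work, offered as an illustration rather than as the tool itself. It assumes both texts split into the same number of whitespace-separated words and pairs them by position. The names `CONSONANT_HINTS`, `plausible`, `align`, `ambiguous` and `invert` are hypothetical, and the consonant table is deliberately partial.

```python
import string
from collections import defaultdict

# Illustrative, partial table: Shavian consonant -> Latin letters that often
# spell it. An assumption for this sketch, not the tool's real table.
CONSONANT_HINTS = {
    "\U00010450": "p",    # PEEP
    "\U00010451": "t",    # TOT
    "\U00010452": "ckq",  # KICK
    "\U0001045A": "b",    # BIB
    "\U0001045B": "d",    # DEAD
    "\U00010464": "l",    # LOLL
    "\U00010465": "m",    # MIME
    "\U0001046F": "n",    # NUN
}

# ASCII punctuation plus U+00B7, the Shavian namer dot.
STRIP = string.punctuation + "\u00b7"


def words(text):
    """Split on whitespace and drop surrounding punctuation."""
    stripped = (w.strip(STRIP) for w in text.split())
    return [w for w in stripped if w]


def plausible(latin, shavian):
    """Return True if the pair shares at least one hinted consonant."""
    hints = [CONSONANT_HINTS[ch] for ch in shavian if ch in CONSONANT_HINTS]
    if not hints:
        return True  # no hinted consonants, so nothing to contradict
    latin = latin.lower()
    return any(letter in latin for hint in hints for letter in hint)


def align(latin_text, shavian_text, max_bad_run=3):
    """Pair words by position; return {latin word: set of Shavian spellings}."""
    latin_words, shavian_words = words(latin_text), words(shavian_text)
    if len(latin_words) != len(shavian_words):
        raise ValueError("texts have different word counts")
    lexicon = defaultdict(set)
    bad_run = 0
    for latin, shavian in zip(latin_words, shavian_words):
        if plausible(latin, shavian):
            bad_run = 0
        else:
            bad_run += 1
            if bad_run >= max_bad_run:
                # Sanity check: the texts have probably drifted apart.
                raise ValueError(f"alignment lost near {latin!r}")
        lexicon[latin.lower()].add(shavian)
    return lexicon


def ambiguous(lexicon):
    """Entries whose key has more than one spelling on the other side."""
    return {key: vals for key, vals in lexicon.items() if len(vals) > 1}


def invert(lexicon):
    """Turn {latin: shavian spellings} into {shavian: latin spellings}."""
    inverse = defaultdict(set)
    for latin, spellings in lexicon.items():
        for shavian in spellings:
            inverse[shavian].add(latin)
    return inverse
```

Calling `ambiguous(lexicon)` lists Latin spellings with several Shavian spellings, and `ambiguous(invert(lexicon))` lists the reverse. A real tool would need a fuller consonant table and some way to resynchronise when the word counts differ.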
I plan to use this to produce a lexicon mapping Latin to Shavian spellings which can be used in automated transliteration, and also as a basic dictionary for humans. This will avoid the problems inherent in the major electronic pronouncing dictionaries currently available, especially the father-bother merger.
I am interested to know which texts I may use for this purpose. For example, I'd quite like to use http://www.mithrandir.com/Shavian/Documents/UTF8Pages/AChristmasCarol.html but it has a stern copyright notice at the top. (I haven't tried to contact the transliterator to ask about this yet.)
Does anyone know whether the texts in this group's file section are fair game for this sort of experiment? Can whatever copyright can legitimately be claimed in a transliteration extend to forbidding its use as a source text in this way?
Thomas
From: "Thomas Thurman" <tthurman@...>
Date: 2009-03-26 19:33:40
Subject: Re: Corpora and alignment
--- In shawalphabet@yahoogroups.com, "Thomas Thurman" <tthurman@...> wrote:
> I plan to use this to produce a lexicon mapping Latin to Shavian spellings which can be used in automated transliteration, and also as a basic dictionary for humans.
By the way, it would look rather like "Andiwxd3.txt" in the files section, except with more words and with the source of each word marked. I might do up a PDF as well so it could be printed, for people who wanted a hard copy.
T
From: "paul vandenbrink" <vandenbrinkg@...>
Date: 2009-03-27 17:23:19
Subject: Re: keyword pronunciation
Hi D.shep and Yah.ya
I have a problem with Whoop. I pronounce it differently in different contexts:
wh-oo-p-s, with the oo-sound of good, for minor mishaps.
w-oo-p-s with the oo-sound of a long u for major collisions, splashes and other disasters, where a loud vocal expletive is appropriate.
I also use that second form (w-oo-p-s) in the compound war-whoops.
Does anyone else find it interesting that there is a soft and a hard form of the oo
vowel sound, just like the 5 other basic Roman Vowel Letters?
Regards, Paul V.
P.S. I pronounce Paul to rhyme with All. It is not Powell.
It is simply appalling.
_________________attached_________________________________
--- In shawalphabet@yahoogroups.com, dshep <dshepx@...> wrote:
> Well, I don't know. Perhaps you are right. Proper names do often preserve
> archaic pronunciations, as you say, but is this a disqualification? We shall
> have to ask Paul if he hears a difference between his name and the word
> 'poll'.
>
> > And also, once again, reiterate that IMO the distinction between 'w' and
> > 'wh' phonemes is one of aspiration only, not of voicing. Any change in
> > voicing in minimal pairs seems - IMHO! - irrelevant.
>
> Hmm. Yes, the difference is the presence or lack thereof of aspiration, which
> is enough to constitute a minimal pair, I would think. If it is aspirated it is by
> definition unvoiced, thereby contrasting with anything similar that is voiced.
> If there is a contrast in the initial element of two words that rhyme then you
> have a minimal pair. The only relevance is that they can be distinguished.
> Am I missing something?
>
> > Penultimately, let me ask you, how do you pronounce 'whoop'? It's a word we
> > only ever came across in print, e.g. as in 'war-whoops', but our (very old)
> > English teacher told us (47 years ago) it should be pronounced 'hoop', objecting
> > to us making it 'woop' or 'hwoop'.
>
> I have tried to think if I have ever had the opportunity to pronounce this
> word but cannot remember a single occasion. However, were I asked to
> do so I would say 'hwoop'. War-hoops sounds most unmartial and a bit
> ridiculous to me. Nor would I consider riding into town for a good time and
> 'hooping-it-up'. On the other hand there is the expression 'whoops and
> hollers' where hoops might be the natural choice.
>
> > Which reminded me: to say that someone comes from a totally unimportant little
> > backwater (you know, the kind of 'locality' marked on a map with a place name,
> > but only two or three houses to be seen in the vicinity - not even a 'hamlet'),
> > here in Australia we say they 'come from Woop-Woop'.
From: "paul vandenbrink" <vandenbrinkg@...>
Date: 2009-03-27 17:49:49
Subject: Re: keyword pronunciation
Apologies to Philip.
Hey D.Shep
Good one.
Better Luck next time.
Regards, Paul V.
P.S. I was going out for a Laugh and a Joke (Cockney Slang)
but it was smoggy already, so what's the use.
____________________________attached_________________
--- In shawalphabet@yahoogroups.com, dshep <dshepx@...> wrote:
>
> Paul wrote:
>
> > P.S. Philip, I am mildly surprised, given that you are a strong
> > proponent of the Shavian Alphabet, that you concern yourself
> > about only being thought somewhat odd. ...
>
> Paul -- joke, joke!!
>
> And, I doubt Philip would wish to be burdened with being responsible
> for my attempts at levity.
>
> ever odd,
> dshep
>
From: "Thomas Thurman" <tthurman@...>
Date: 2009-03-29 02:55:28
Subject: Alignment
I have finished the first version of the tool I mentioned previously which aligns Shavian and Latin versions of the same document. I have fed it Scott Harrison's version of "A Christmas Carol" and the US Constitution and it has produced a lexicon of 4,818 unique words. Including the words in "Androcles" brings us up to 6,012 unique words. (By comparison, CMUDict has 133,827 words and Moby has 177,267.)
I imagine that if I can find works of similar length to "A Christmas Carol" to feed it, the size will continue to increase, though most of the new words will be rare, since the basic words are mostly ones we already have.
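One way to watch that growth is to count how many previously unseen words each text contributes (going by the figures above, "Androcles" added 6,012 - 4,818 = 1,194). The following is only a rough sketch: the `growth` function and its input shape, a list of (name, words) pairs, are assumptions made for illustration.

```python
def growth(documents):
    """For each (name, words) pair, report (name, new words, running total)."""
    seen = set()
    rows = []
    for name, doc_words in documents:
        # Case-fold so "The" and "the" count as one word.
        new = {w.lower() for w in doc_words} - seen
        seen |= new
        rows.append((name, len(new), len(seen)))
    return rows
```

If the number of new words per text keeps shrinking as texts are added, that would support the guess that later additions are mostly rare words.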
Suggestions for further exploration are welcome; I will upload the script and the results if anyone's interested.
T
From: "Thomas Thurman" <tthurman@...>
Date: 2009-03-29 02:56:47
Subject: comic
Did anyone else see this?
http://www.moderntales.com/comics/almamater.php?name=almamater&view=single&ID=17495
T
From: Star Raven <celestraof12worlds@...>
Date: 2009-03-29 12:43:15
Subject: Re: [shawalphabet] comic
Hey neat, though it does highlight the futility of our work. *le sigh*
--Star
=========
The probability that a book is good exponentially decreases with respect to the number of words the author has made up.
--xkcd comics
My LJ: http://wodentoad.livejournal.com
Or: http://happyhousewyf.livejournal.com
________________________________
From: Thomas Thurman <tthurman@...>
To: shawalphabet@yahoogroups.com
Sent: Saturday, March 28, 2009 10:56:46 PM
Subject: [shawalphabet] comic
Did anyone else see this?
http://www.moderntales.com/comics/almamater.php?name=almamater&view=single&ID=17495
T
From: dshep <dshepx@...>
Date: 2009-03-30 01:07:38
Subject: re: comic
Well, if we are an object of fun then we can't be entirely obscure.
obscurely,
dshep
From: dshep <dshepx@...>
Date: 2009-03-30 01:23:14
Subject: re: keyword pronunciation
Paul wrote:
> P.S. I pronounce Paul to rhyme with All.
> It is not Powell, It is simply appalling.
Yes but, does your 'all' rhyme with (American) football (ºwl),
or Jean-Paul as in Quebec (almost = poll or pole)?
Incidentally, the novelist Anthony Powell, author of the well-
received "A Dance to the Music of Time" some years back,
insisted that his name, Welsh in origin, should be pronounced
to rhyme with 'pole', not 'towel'.
pollingly,
dshep
From: "Thomas Thurman" <tthurman@...>
Date: 2009-04-01 03:50:41
Subject: A bialphabetic experiment
I have created a wiki at http://shavian.marnanel.org/ .
It uses the same software as Wikipedia, but with one custom extension. Pages whose names begin with "Document:" are automatically transliterated on the fly into the Shavian alphabet by looking up each word in the wiki itself. Examples:
http://shavian.marnanel.org/index.php/Document:When_I_was_one_and_twenty
http://shavian.marnanel.org/index.php/Document:Androcles_and_the_Lion/Act_I
Red words which aren't in the lexicon can be added, and blue words which are can be altered, simply by clicking on them. Editing the page will reveal that it's still in the Latin alphabet underneath.
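The lookup itself can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the wiki extension's code: the lexicon is taken to be a plain dict from lower-case Latin words to Shavian spellings, and `WORD` and `transliterate` are hypothetical names. Words missing from the lexicon are left in Latin and collected, corresponding to the red words.

```python
import re

# A "word" here is a run of ASCII letters and apostrophes (an assumption).
WORD = re.compile(r"[A-Za-z']+")


def transliterate(text, lexicon):
    """Return (transliterated text, list of words not found in the lexicon)."""
    missing = []

    def swap(match):
        word = match.group(0)
        shavian = lexicon.get(word.lower())
        if shavian is None:
            missing.append(word)  # would be shown as a red word
            return word           # leave it in the Latin alphabet
        return shavian            # would be shown as a blue word

    return WORD.sub(swap, text), missing
```

Because only the matched words are replaced, punctuation and spacing pass through unchanged, and the source text stays in the Latin alphabet.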
I would be happy and honoured if a few people could come and build this up with me. Every word added to any document is automatically transliterated in every other, so the more public domain documents we copy from Wikisource and elsewhere the more the lexicon will learn and the larger a library we will have.
I'm not publicising this other than here yet, since it's very experimental. I'm looking forward to seeing how far it can go.
Thomas