Saturday, November 19, 2011

The Genealogy of Writing Systems

What do the surnames Staszak, Schöndorf, Chêneau and Jaroš have in common? Besides being in my family tree, they all share a common sound. Polish "sz", German "sch", French "ch" and Czech "š" all represent the same sound, the one written in English with the combination "sh" and in the International Phonetic Alphabet with the symbol "ʃ". Why are there so many ways of representing this sound in different languages?

To answer this question, we must understand that writing systems, like people, have genealogies. In this case, all these writing systems have a common ancestor, the Latin alphabet. The reason there are so many different ways of representing the "sh" sound is that this sound is not found in Latin. As the Latin alphabet came to be adopted for other languages, people came up with ways of representing sounds not found in Latin.

There are four ways to do this. The examples below show some of the many ways that the "sh" sound is represented.
1. Combine letters to represent new sounds (Polish "sz", German "sch" and English "sh")
2. Modify existing letters by adding extra symbols called diacritics (Czech "š" and Turkish "ş")
3. Use a letter in a new way (Maltese "x")
4. Create a new symbol or borrow one from a different writing system (I can't think of any examples for this sound, but there are cases for other sounds like Old Norse þ for the "th" sound in "thorn")
In many cases, especially for languages without a long written history, the new writing system was created by an individual or a committee. However, in languages with long written histories, things are much more complicated. For example, French is a descendant of Latin in two ways. It owes not only its alphabet to Latin, but the language itself. The "ch" which represents the "sh" sound developed from an earlier "k" sound, or hard "c". The development in this case was more gradual and less deliberate.

Although we often assume a deep connection between a language and how it is written, this is perhaps the aspect of a language that is most easily changed. Turkey did away with the Arabic script in the first half of the twentieth century and adopted a modified Latin script. In some cases, a single language or two very closely related languages may adopt two different systems, like Serbian (Cyrillic script) and Croatian (Latin script) or Hindi (Devanagari script) and Urdu (Arabic script).

For genealogists, an awareness of these differences between scripts can come in handy. A single surname may be spelled in many ways depending on the context. For a German name starting with "sch" like Schulz, don't be surprised if it is rendered by an English speaker as Shulz. Things are especially complicated for a language like Czech. A name like Jaroš may be spelled Jarosch, as it frequently was when Bohemia was a part of the German-language-dominated Austrian Empire. In Poland, this name might become Jarosz and, in the English-speaking world, Jarosh. Knowing something about your family's history, where they came from, and the languages of the region can help you better search for your ancestors.
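
If you search records with a script or spreadsheet, the spelling swaps described above can be roughed out in a few lines of Python. This is only a minimal sketch: the table `SH_SPELLINGS` and the function `surname_variants` are names I made up for illustration, not part of any genealogy tool, and the table covers only the spellings of the "sh" sound mentioned in this post. It also replaces every occurrence of the spelling blindly, so treat the results as search suggestions rather than attested forms.

```python
import unicodedata

# Spellings of the "sh" sound taken from the examples in this post.
# Illustrative and incomplete: real records use many more variants.
SH_SPELLINGS = {
    "Czech": "š",
    "German": "sch",
    "Polish": "sz",
    "English": "sh",
}


def surname_variants(surname, known_spelling):
    """Swap one spelling of the "sh" sound for each of the others.

    Simplification: the name is lowercased and re-capitalized, so only
    names with a single leading capital letter come out correctly.
    """
    # Normalize so that "š" typed as s + combining caron still matches.
    name = unicodedata.normalize("NFC", surname).lower()
    if known_spelling not in name:
        raise ValueError(f"{known_spelling!r} does not occur in {surname!r}")
    return {
        language: name.replace(known_spelling, spelling).capitalize()
        for language, spelling in SH_SPELLINGS.items()
    }


for language, variant in surname_variants("Jaroš", "š").items():
    print(f"{language}: {variant}")
```

Called with "Jaroš" and "š", it yields the four spellings discussed above; called with "Schulz" and "sch", the English entry is Shulz.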


  1. And don't forget "s" in Hungarian - the name would be spelled Jaros, but pronounced Jarosh (and to confuse things further, sz for the plain old "s" sound).

  2. @Greta: Hungarian is strange indeed. It is the only case I can think of where the default pronunciation of "s" is not the typical dental or alveolar sibilant. It may have something to do with the relative frequency of the two sounds. "S", which represents the English "sh" sound, is more common than "sz", which represents the English "s" sound.