13

J, U, and W are included in the ISO basic Latin alphabet, which consists of 26 letters. However:

  • Classical Latin had only 23 letters, and J was used only as a variant of I, much as ς is a variant of σ.
  • J and U were not distinguished from I and V in Europe until the late Middle Ages, and in English they were not regarded as separate letters until as late as the 18th century.
  • Many European languages do not have a J or V (sometimes U instead) in their alphabets.
  • W, which began as the digraph UU or VV and later became a ligature (as its name implies), is absent from many European alphabets, yet it was included in the ISO basic Latin alphabet.
  • Meanwhile, Æ and Œ, which were common not only in English but also in Medieval Latin and other European languages, did not make it into the ISO basic Latin alphabet.
  • Ch, a digraph dating back to the 2nd century BC, was included in Gerke's version of Morse code and was standardized by the ITU (as were Ä, Ö, Ü), but it did not survive either.

Why were J, U, W included? Is it just a coincidence that English is the only major language that used all these letters and no more in its orthography?

Related: Does any language using the Latin alphabet have a unique name for "w"?

Schezuk
  • 42
    The ISO was founded in the 20th century and seeks to address contemporary needs. It's not especially concerned with how useful medieval scribes would find its standards. – Cairnarvon Feb 24 '21 at 14:02
  • 1
    separate questions should be asked as separate questions, as such I have removed your additional question (why X has a capital form despite appearing word-initially in no native vocabulary) – Tristan Feb 24 '21 at 16:06
  • 1
    to answer it though, the long story is that capital letters are just the old carved forms of the letters, whilst lower case letters are forms specialised to being written by hand. Texts written with a mix of upper and lower case letters are relatively recent and the conventions around it only really became settled after the introduction of the printing press and standardised letter shapes. As X was used in classical (all capital) Roman inscriptions there is an obvious majuscule form for the letter, so it's natural it was adopted for initial-caps loanwords and acronyms – Tristan Feb 24 '21 at 16:09
  • 4
    Is there any Latin-script-based orthography without u? And any in Europe besides Italian and Sardinian and so on without j? – Adam Bittlingmayer Feb 24 '21 at 17:42
  • @AdamBittlingmayer I can't think of any modern European ones lacking U, but e.g. four-vowel systems /i e a o/ are not uncommon in Mesoamerica so those languages tend to either use U for /w/ or dispense with it. (That said, of course, ASCII was not designed with Mesoamerican romanizations in mind.) – Draconis Feb 24 '21 at 17:49
  • 3
    Don't forget that the original Latin alphabet also lacked <G>; the letter <C> was read as both [k] and [g]. <G> was introduced by the freedman Spurius Carvilius Ruga, the first Roman to open a fee-paying school, who taught around 230 BCE: https://en.wikipedia.org/wiki/G – Yellow Sky Feb 24 '21 at 18:31
  • @AdamBittlingmayer - Welsh and Irish Gaelic (and I guess Scottish Gaelic and Manx Gaelic, too) don't use j in native words, they use this letter exclusively in borrowings from English which keep their English pronunciation, like “jazz” or “jeep”. In fact, Italian does also use j in those words borrowed from English. – Yellow Sky Feb 25 '21 at 05:35
  • @YellowSky Italian even uses j in words borrowed from Italian (mostly placenames and family names). – Adam Bittlingmayer Feb 25 '21 at 06:50
  • @YellowSky Manx orthography is based on English, and differs, among many other things, from Scottish and Irish by using j instead of d to represent /dʲ/ (that is, slender /d/) in native words as well; so ‘end’ is jerrey /ˈdʲerə/ in Manx, while it’s deireadh in both Irish and Scottish (representing /ˈdʲeɾʲə(x) ~ ˈdʲeɾʲu/ in Irish; in Scottish, the final syllable varies a lot, between /ə ~ əɣ ~ ək ~ əv ~ ʊ/ or nothing at all). – Janus Bahs Jacquet Feb 25 '21 at 14:45
  • 8
    Latin in that name does not reference the Latin language, but rather the Latin script, as opposed to Arabic, Hebrew, Cyrillic, the various Indian scripts, Thai, the various ideogram-based scripts, and many many others. – jcaron Feb 25 '21 at 17:37
  • @jcaron This. The question is based on an (understandable ;-) ) misunderstanding of the word Latin. It's "Latin" only in the way our numbers are "Arabic". – Peter - Reinstate Monica Feb 25 '21 at 18:14
  • Why would you include K and Z? Aren't those Greek letters that were not used in classical Latin? – Michael Hardy Feb 25 '21 at 19:11
  • 2
    @MichaelHardy They're rare but attested in Classical Latin, in words like kalendae or zōna. Same for Y. – Draconis Feb 25 '21 at 19:19
  • @Cairnarvon you'd think that would be obvious. – RonJohn Feb 27 '21 at 19:56
  • I think your second bullet point meant to use "I, V" instead of "I, J"? – Paŭlo Ebermann Feb 27 '21 at 20:18
  • @PaŭloEbermann Thank you for pointing that out! I have little knowledge of medieval calligraphy, but V was adopted in Trojan carvings. So I and V versus J and U, I guess. – Schezuk Feb 28 '21 at 01:15
  • Ultimately because J, U, and W, unlike Æ (Ä) or Œ (Ö), are not representable by way of diacritics, and the digraph CH, unlike W, Æ, and Œ, is not a ligature. – Lucian Apr 26 '21 at 12:15

2 Answers

36

Despite its name, the ISO Basic Latin Alphabet isn't particularly concerned with representing Latin. It was developed in the modern day, so the fact that I~J and U~V weren't consistently distinguished until the 18th century isn't relevant—they're consistently distinguished now.

But the observation that the ISO Basic Latin Alphabet aligns exactly with what's needed for English and not with what's needed for most other European languages is an important one, and gets at the core of the answer.

A lot of early work in electronic transmission of text was done in America, and as such, the early codes used were designed pretty much exclusively for English. It's the same reason why American varieties of Morse code didn't have codes for ß and ø, and why American typewriters didn't have keys for them: they just weren't needed for English, and including them was an additional expense for not much benefit.

In the 60s, American manufacturers standardized "ASCII" (the American Standard Code for Information Interchange) to make it easier for their devices to talk to each other—without any particular consideration given to other languages, for the same reason as with typewriters and telegraphs. And due to the significant influence of American tech manufacturers, the original seven-bit ASCII eventually got enshrined in international standards; variations like eight-bit ASCII and eventually Unicode tended to extend it, not modify the core of it, with non-English letters like ß and ø relegated to higher codepoints separate from the English alphabet.

And thus, the "ISO Basic Latin Alphabet" is just a fancy name for the English alphabet, circa the 1960s and 1970s when these standards were first devised. It's a historical accident, really, nothing more.
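
The layout described above can be checked with a short script. This is only a minimal sketch, not part of any standard: the helper name `in_seven_bit_ascii` is made up for illustration, and the sample letters are taken from the question and this answer. It tests which letters have a code point that fits in seven bits.

```python
import string

# The 26 uppercase letters of the ISO basic Latin alphabet,
# as provided by Python's standard library.
basic_latin = string.ascii_uppercase


def in_seven_bit_ascii(ch: str) -> bool:
    """Hypothetical helper: True if the code point fits in seven bits (0..127)."""
    return ord(ch) < 128


# J, U, W sit inside the seven-bit range; the other letters do not.
for ch in "JUW" + "ßøÆŒ":
    print(f"{ch}: U+{ord(ch):04X}, seven-bit ASCII: {in_seven_bit_ascii(ch)}")
```

All 26 basic letters pass the check, while ß, ø, Æ and Œ land at code points above 127, which matches the point that later encodings extended ASCII rather than rearranging it.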

Draconis
  • 4
    You could argue that Basic Latin doesn’t even cater to English as such, but specifically to American English, excluding codes for variant forms primarily found in British English such as œconomics, mediæval, façade, café (in increasing order of usage). – Janus Bahs Jacquet Feb 24 '21 at 17:37
  • 4
    @JanusBahsJacquet True, though even in British English I'm more used to seeing oe and ae than œ and æ. I wonder if there was a measurable decline in the non-ASCII variants in published works when electronic communication caught on (and thus people saw and used the ASCII ones more)? – Draconis Feb 24 '21 at 17:45
  • 3
    I’d definitely say œ and especially æ were more common in the ’50s than they are now, but they’ve been in steady decline for well over a century. I’m sure the onset of electronic communication had some measurable effect, but I would guess it’s less significant than the general decline that was already underway. – Janus Bahs Jacquet Feb 24 '21 at 21:42
  • 2
    @JanusBahsJacquet Technically, æ is merely a graphical flourish (a ligature) for the characters ae. – chrylis -cautiouslyoptimistic- Feb 25 '21 at 00:11
  • 8
    @chrylis-cautiouslyoptimistic- In English, the ligatures are non-canonical and can always be substituted for <ae, oe>. In other languages, they are canonical and not equivalent to the sequences (cf. Danish aer ‘pets, strokes’ vs ær ‘maple tree’). Even in English, not all cases of <ae, oe> can be written <æ, œ>, so the two are not bidirectionally equivalent (e.g., fœtus, though unetymological, is still seen, but *gœs and *dœs are utterly unknown in Modern English). And as the question mentions, <w> is just a ligature of <uu> or <vv>. – Janus Bahs Jacquet Feb 25 '21 at 00:21
  • 1
    @JanusBahsJacquet just wait until everyone gœs ahead and dœs exactly that, for a false sense of æsthetics – user253751 Feb 25 '21 at 09:41
  • 3
    As a slightly pedantic nit-pick, there is no "eight-bit ASCII"; there are, as you say, lots of different standards that extend or provide compatibility with ASCII. The most common are the ISO-8859 family, along with vendor-specific encodings like Windows 1252. – IMSoP Feb 25 '21 at 11:01
  • 8
    @JanusBahsJacquet: The claim about w being a ligature doesn't seem like it would have emerged in a vacwm. – supercat Feb 25 '21 at 14:25
  • 2
    The use of an alphabet that includes j, u, and w, but excludes some other characters goes back way before ASCII. Baudot code also used such an alphabet, as did many typewriters. – supercat Feb 25 '21 at 14:26
  • @supercat However, original Continental Baudot code included É, too. – Schezuk Feb 26 '21 at 02:58
  • @Schezuk: How about Morse Code, which I should have mentioned as an even earlier means of transmitting text? – supercat Feb 27 '21 at 17:18
  • @supercat The original Morse code, or American Morse or Railroad code, saw no adoption beyond North America. The Continental or International Morse code was largely based on Gerke's Hamburg alphabet, which included umlaut letters and the digraph Ch but did NOT distinguish I and J, which was supplemented from Steinheil's code. The ITU adopted even more accented letters in its conventions (say, 1868), and you can still see the E acute inherited by recent ITU-R recommendations. Non-English Morse codes had more letters than English but shared fewer code points. – Schezuk Feb 28 '21 at 01:57
11

Is it just a coincidence that English is the only major language that used all these letters and no more in its orthography?

Is it a coincidence? No.

But that's not the right test, because there are other letters that are not core to the orthographies of all major languages (k, y, x and q, for example).

A more consistent test, then, would be whether the letter is used in the orthographies of multiple major languages. And on that, j, u and w certainly qualify.

A map of pronunciations of j, and thus implicitly of where j is used. In fact, j does have a pronunciation in Italian, where it is still used in family names and placenames, and pronounced similarly to i in the modern orthography.

Adam Bittlingmayer
  • 1
    Nice map. Do you have that for more letters? – Joachim W Feb 25 '21 at 14:21
  • 1
    It was just stolen from Wikipedia. Looks like there is one for u too, and also for c. – Adam Bittlingmayer Feb 25 '21 at 20:07
  • 3
    I believe that Italy should be blue too, since the primary pronunciation of J in Italian is /j/. P.S. Interestingly, most of the Latin words which had the J are pronounced with /dʒ/ (like in English), but are written (for the sake of simplicity) with G instead, e.g. maggiore (major), giugno (June), giustizia (justice) and even Gesù (Jesus). – trolley813 Feb 26 '21 at 06:34
  • 1
    @trolley813 I agree, but I would point out that most of those Italian words in Italian are considered loans from local languages where j is more core, so it could be better for the map to show those languages. – Adam Bittlingmayer Mar 01 '21 at 07:50
  • Hi, can you give the source of the picture? Or some source where I can find similar colorized maps based on native pronunciations? – Sanjit Jena Mar 01 '21 at 10:39