7
  1. First, I don't speak/understand any so-called agglutinative languages, like Turkish. I also don't know German.

  2. I understand there's no good definition of the concept of "word" that applies to all languages. But for the sake of this question, let's assume we define a word as something independent of writing.

  3. I am always a bit skeptical about claims that some languages can construct very long words. This skepticism comes from the fact that I've seen such claims about languages I know well. For example, I have often heard and read things like "English words are much longer than Chinese and Vietnamese words", while in my opinion these claims rest on biased bracketing of syllables and morphemes. Take "unbreakable", which translates to Vietnamese as "không thể phá vỡ": for some reason "không thể phá vỡ" is never regarded as a word, by either Vietnamese or English speakers. I would not consider "unbreakable" a single word any more than "không thể phá vỡ".

Now I think similar "tricks" are made with German and Turkish long words. That is I believe that the Turkish "muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine" and the German "Kraftfahrzeug-Haftpflichtversicherung" are just bunches of words. It just happens that spaces are not put inside these bunches.

The only doubt I have about my example is that "không" is actually a single word that can be uttered alone, while the English prefix "un" is not.

Could anyone break these long words down into shorter words to show that it's possible, or otherwise contradict my intuition-based claims?

GA1
  • 31
    I don't think you'd find many/any native English speakers who wouldn't consider "unbreakable" to be a single word. – Cairnarvon Jan 15 '21 at 23:16
  • 11
    Un- is a prefix and -able is a suffix. One of the best definitions of ‘word’ that I’ve come across is that it can be meaningfully uttered in isolation and used to fill a slot in the syntax of the language in question. That’s true of break, but not un- or -able (which is different from able). In the German example, it works fine for Kraft, Fahrzeug, Pflicht, etc.; those are all words, and the whole thing is a compound. But why shouldn’t (German) compounds be words? They can fill a syntactic slot, and cases are added at the end of the whole thing, not per element. – Janus Bahs Jacquet Jan 15 '21 at 23:27
  • 6
    English itself has many single "words" that could be analyzed as two separate nouns: "innkeeper", "sailboat", "football", "bookshelf", etc – chepner Jan 16 '21 at 13:59
  • @JanusBahsJacquet: a nominal such as nice touch meets both criteria: it can be meaningfully uttered in isolation, and it can fill a slot "in the syntax" (although I'm not sure that I understand what you mean by that): the slot in This is a ....... can be filled by nice touch. So, it's a word? – Schmuddi Jan 16 '21 at 13:59
  • 1
    @Schmuddi The term is deliberately vague because slots differ so much between languages, but I mean what you might call a ‘non-reducible’ slot. So for English, it wouldn’t be a VP or NP, which may both be subdivided into head, modifiers, complements, etc., but a slot that cannot be further subdivided, the ‘bottom layer’ in a classic syntax tree. (Talking basic, linguistics-101 analysis here – some theoretical frameworks will subdivide a sentence into umpteen layers of roles and carriers and what-have-yous, which isn’t what I mean.) – Janus Bahs Jacquet Jan 16 '21 at 14:10
  • 7
    That dash in Kraftfahrzeughaftpflichtversicherung just hurts. It doesn't belong there. Yes, with the most recent spelling rules it's allowed to add dashes for 'better readability' (by people who don't read well, if you ask me), but still. – Nobody Jan 16 '21 at 18:53
  • 2
    This question is just the linguist version of 'is a taco a sandwich, is cereal soup? '. – eps Jan 17 '21 at 00:26
  • 7
    A native German speaker would consider Kraftfahrzeughaftpflichtversicherung to be a single word. It doesn't really matter if you don't think so. Also, Scrabble is played with words that can be found in a dictionary. Kraftfahrzeughaftpflichtversicherung can be found in a German dictionary. – Polygnome Jan 17 '21 at 01:12
  • Got it @Cairnarvon , maybe unbreakable is not the best example? But what about impossible, unimaginable, inedible etc.? Some of similar constructs would be considered a word by the majority of English native speakers, wouldn't they? – GA1 Jan 17 '21 at 13:05
  • @eps Sure it is. We will not find any meaningful definition of "word" here, but I hope these real-world examples will help me and other people understand the problems behind it. – GA1 Jan 17 '21 at 13:07
  • Why such a short word for the German example instead of the meanwhile famous Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz? – Hagen von Eitzen Jan 17 '21 at 13:28
  • 4
    Long German words are compounds. If English worked the way German does, then in the U.S.A. the document celebrated on the 4th of July would be the Independencedeclaration. – Michael Hardy Jan 17 '21 at 18:21
  • 3
    @MichaelHardy Stackposters who post helpful Languageinformationcomments deserve Upvotemoderationtreatment. – Robert Columbia Jan 17 '21 at 23:06
  • English has a tendency to create compound words, just like German, when they are used long (I mean decades and centuries) and frequently enough, often going through a phase where they are spelled with a hyphen. German just does it from the start and goes to the extreme. (Dutch speaker here; Dutch has exactly the same tendency to create compound words just like German). – Amedee Van Gasse Jan 18 '21 at 08:44
  • 2
    @RobertColumbia sounds a bit like how software developers write variable names. – Amedee Van Gasse Jan 18 '21 at 08:45
  • 2

    But for the sake of this questions let's assume we define word as something independent of writing.

    This isn't a workable definition of word. Tables exist independent of writing, but a table is not a word. If this is to be a question about the nature of Turkish and German then I think it needs to be edited to include a more complete definition of 'word'. It's up to the questioner to choose the definition according to what they want to learn about the languages.

    – bdsl Jan 18 '21 at 08:55
  • @HagenvonEitzen Because it's a way better example. Although it is almost always abbreviated to Kfz-Haftpflichtversicherung, every German who has a car also has such insurance, so it's a commonly known word. Your example is obscure. – kutschkem Jan 18 '21 at 08:56
  • 3
    note that english has the exact same issue (as turkish) in words like "antidisestablishmentarianism" – somebody Jan 18 '21 at 10:49
  • ithinkyoullfindthatthisverycommenthereisthelongestwordinenglishasitsevenlongerthanwordslikeantidisestablishmentarianismandpneumoultramicroscopicsilcovolcanoconiosis – user253751 Jan 18 '21 at 17:17
  • 1
    Note that in German there are two plural forms of "word": "Worte" and "Wörter". The latter covers the grammatical entity of something consisting of letters and fulfilling a role in a sentence -- the former represents a "thought", like "Schimpfworte" (swear words). – ljrk Jan 18 '21 at 21:17
  • @GA1: German has a different concept on the combination of words. In English, I can say "car insurance" and you automatically bridge the mental gap that "it's an insurance somehow concerning cars". But just because it contains a space doesn't mean you can take any part away and retain the meaning of the whole. It's similar in German, but we take away the spaces. This might've been influenced by a more complex grammar (e.g. cases), which affects the compound as a whole. English is a language of phrases of many small words; German is a language of few but precise, though more complex, words. – hoffmale Jan 18 '21 at 21:42
  • Yes, you can say "motor vehicle liability insurance" (literal translation of Kraftfahrzeughaftpflichtversicherung) without spaces is a "trick", but how else would you describe that exact concept? And since you cannot splice it apart, might as well make it count as one integral part of the sentence (for less confusion). – hoffmale Jan 18 '21 at 21:48
  • @hoffmale But "motor vehicle" and "liability insurance" are words in their own right that retain their meanings in this case. It's a compound word, like the English "innkeeper". – Loren Pechtel Jan 19 '21 at 03:30
  • @LorenPechtel It's a compound consisting of compounds. Yes, "motor vehicle" and "liability insurance" are the major building blocks. Yes, they are compounds in their own right. But: Does a "motor vehicle liability insurance" cover any and all liabilities? No, only those arising from (common) usage of a motor vehicle (whether that be a car, a truck, a zamboni, ...). So it's a specialized kind of liability insurance, and taking any part away detracts from that meaning. Just like a bookshelf is a special kind of shelf, or innkeeper/beekeeper a special kind of keeper. – hoffmale Jan 19 '21 at 07:28
  • 2
    I would like to add that in some languages it matters whether a concept is written as one word or as two. In my mother tongue, Norwegian, a North Germanic language, there can be misunderstandings (or great humor) if a compound word is written as two words.

    E.g.
    feil/melding = error message vs. wrong message
    lamme/lår = leg of mutton vs. paralyzed legs

    Books have been published about this (funny little books you might give your relatives for Christmas) with pictures to show the difference, like those reproduced on the website https://norsksidene.no/web/PageND.aspx?id=99593

    – tetra Jan 20 '21 at 23:41

4 Answers

43

From the perspective of linguistics, the question is meaningless though well-intentioned. "Word" is not a well-defined technical concept in linguistics (or, some people may have concocted a definition of "word" for their purposes, but there isn't even a widely-believed definition). The best definition is "a maximal string of letters not containing spaces", and that's really not very good (not all writing systems use spaces, plus that means that unwritten languages don't have words).
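To make the orthographic definition concrete, here is a minimal Python sketch that counts "maximal strings not containing spaces". The function name `count_space_delimited` is made up for illustration, and the test phrases are the ones quoted in this thread. It only shows what the spelling-based definition measures, not what a word is.

```python
def count_space_delimited(text: str) -> int:
    """Count maximal runs of non-whitespace characters in text."""
    # str.split() with no argument splits on any run of whitespace.
    return len(text.split())


# Phrases taken from the question and comments.
examples = [
    "unbreakable",
    "không thể phá vỡ",
    "Kraftfahrzeughaftpflichtversicherung",
    "motor vehicle liability insurance",
]

for phrase in examples:
    print(count_space_delimited(phrase), phrase)
```

Under this metric the German compound counts as one unit and its literal English translation as four, which is purely a consequence of spelling conventions.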

One approach has been to equate "word" with "syntactic terminal", which simply swaps the question to "what is a syntactic terminal?". In contemporary Minimalist syntax, verbs especially are composed of many many nodes which still are realized as "a word" in a language like Latin (with tense, aspect, person and number agreements each of which contributes a syntactic node). There are also phonological accounts which appeal (circularly) to some property of assumed words (stress on the penultimate syllable of the word; a requirement to have at least 2 syllables in the word).

It seems to be true that anything that a naive speaker of a language can utter alone is at least a word, therefore an English speaker can't say "gira" which is a sub-part of "giraffe" and a Saami speaker can't say "beatna" which is a sub-part of [beatnag-a] "dog (acc. sg)". Clearly, not every utterance is a word. If you define "word" as a minimal utterance, then you would exclude the German and Turkish examples as "not words" (they are not minimal). But then "oxen" and "cats" are not words, because they contain words – "ox, cat". And that just seems wrong.

There does appear to be a phonological object, a grouping of many syllables into a thing, where rules apply within that thing, or with reference to the thing ("no obstruents at the end of the ___"), which we call the phonological word, or ω. This thing isn't "defined", it is or may be constructed, with language-specific rules. When you find that a large sub-part of an "utterance" has a certain kind of coherence w.r.t. phonological rules, you can call that unit a "(P-)word". But it turns out that this ω thing is not always coherent in a language, and clitics can present contradictory evidence where they partially act like they are "in the word" and partially act like they are outside.

Hence, most linguists have abandoned the concept of "word" as a coherent technical concept.

user6726
  • 7
    This is a good answer, except for the puzzling claim that the definition of a word as a string of letters with no spaces is the best one. There’s no satisfactory definition, agreed, but that one has to be one of the worst ones: most have more or less peripheral problems that depend on more or less arbitrary categorisations of different aspects of language, but this one is perhaps the most arbitrary of all and excludes the vast majority of all words in its very definition. – Janus Bahs Jacquet Jan 16 '21 at 01:55
  • 14
    But that is how most people understand "word", so if you ask a random person "how many words are in this text", they will count blank-delimited sequences (unless they are an Arabic / Chinese / Thai speaker). It correlates best with the procedure that ordinary people use for identifying words. I'm not at all suggesting that "word" is a valid technical concept, so my metric of goodness relates to understanding this strange social object that people talk about. – user6726 Jan 16 '21 at 02:05
  • 6
    True, most people will use it as a basic beginning point in identifying words if faced with a written text (until you ask them why classmate is a word and soul mate isn’t); but even the most average of random speakers would be readily able to recognise words as spoken units, which this definition excludes completely. Nobody would say you have to write it down for it to become a word. – Janus Bahs Jacquet Jan 16 '21 at 02:14
  • 1
    @JanusBahsJacquet Yet you do have to write it down mentally to answer the question whether soul mate/soulmate is one or two words in the common understanding of the concept 'word'. It sounds perfectly sensible to define this common understanding of the concept as a factor of spelling as the German language beautifully shows. The observation that languages without a writing system and illiterate people would have a less strict/non-existent understanding sounds likely/reasonable. – David Mulder Jan 16 '21 at 23:41
  • 2
    @DavidMulder It’s difficult for us to imagine what it’s like not to know writing, but I certainly don’t have to mentally write down anything to know that soul mate and classmate are both words (regardless of the spaces). We do know for a fact that the concept of a word predates written language, perhaps not universally, but in many languages; and apart from actual linguistic analysis (which largely eschews the topic altogether), there’s no reason to think speakers of spoken-only languages’ idea of a ‘word’ would be any less intuitive than ours. – Janus Bahs Jacquet Jan 16 '21 at 23:50
  • @user6726 Just out of curiosity, have you ever played Scrabble? If so, how did you manage to play any word if words don't exist? How can anyone play Scrabble if the concept of a "word" is nonsensical? The fact that Scrabble can be played in a myriad of languages seems to suggest that a "word" is some kind of sensible unit in a lot of cases. – Polygnome Jan 17 '21 at 01:15
  • 6
    @Polygnome Whether a string of letters/symbols is considered a valid word in the game of Scrabble is determined by whether said string is included in some list agreed upon by the participating players. (Commonly used word list references for the English language are "NASPA", "Collins", etc.) However, that doesn't necessarily mean words are well-defined as a linguistic or technical concept. It might just as well mean that some "authority figure" by the name of "Collin" tossed a coin every time he had to decide whether a queried string should be deemed worthy of inclusion in his list. – Will Jan 17 '21 at 03:33
  • 1
    Requiring spaces is anachronistic. Most early texts didn't have spaces as word divisions (and few word divisions at all). Some languages (like Thai) still don't. It's an extremely modern, western, Latin-alphabet-based position that, agreeing with @JanusBahsJacquet, is the worst definition of what a word is. – cmw Jan 29 '23 at 03:53
14

In German, noun phrases that are used to describe a separate entity other than their individual nouns are written without spaces. Thus, the example of Kraftfahrzeug-Haftpflichtversicherung may indeed be considered as a "bunch of words" in the sense you have described. In Turkish however, this is not true. The example in Turkish you have provided contains only a single "word" which is muvaffakiyet, meaning success. All of the remaining parts of that word are suffixes. They are used to modify the meaning, add the notion of time etc.

For example:

  • Muvaffakiyet = Success
  • Muvaffakiyetsiz = Unsuccessful
  • Muvaffakiyetsizleş (-mek) = to become unsuccessful and so on.

If you check the last item, you might think that the last added part, "-leş", means "to become". But that would be incorrect. It is just a suffix and is meaningless on its own. So your Turkish example is not a "bunch of words" written without spaces: except for the first component, muvaffakiyet, none of the components has a meaning on its own.
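The stacking shown in the list above can be sketched in a few lines of Python. This is only an illustration: the names `STEM`, `SUFFIXES`, and `build_forms` are made up, and the sketch covers just the two suffixes glossed in this answer, not the full example word.

```python
from itertools import accumulate

# Stem and suffixes exactly as listed in the answer above.
STEM = "muvaffakiyet"
SUFFIXES = ["siz", "leş"]


def build_forms(stem: str, suffixes: list[str]) -> list[str]:
    """Return the stem followed by each successively longer suffixed form."""
    # accumulate() on strings concatenates them left to right.
    return list(accumulate([stem, *suffixes]))


for form in build_forms(STEM, SUFFIXES):
    print(form)
```

Each printed form is a usable word, but the pieces added at each step ("siz", "leş") are not, which is the contrast with the German compound.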

On the other hand, your German example, Kraftfahrzeug-Haftpflichtversicherung, translated into Turkish would be "motorlu araç sorumluluk sigortası". As you can see, that is written with spaces as each of the individual parts are not suffixes, but nouns, meaning that they have a meaning on their own. (Motor (-lu) = (with/containing) motor, araç = vehicle, sorumluluk = liability, sigorta (-sı) = insurance (of))

talkanat
  • 2
    Your example is continued here at the bottom: https://en.wikipedia.org/wiki/Longest_word_in_Turkish – aytunch Jan 16 '21 at 19:31
  • 1
    @talkanat. Sure, leş can't stand alone in Turkish, but neither can a and the in English and các, những in Vietnamese and and in Chinese. But somehow all these 6 morphemes are never agglutinated into words in English, Vietnamese and Chinese, thus making the words of these languages shorter. Can we imagine a different writing convention for Turkish words? Is the problem with the number of suffixes in Turkish? – GA1 Jan 17 '21 at 13:12
  • @GA1, as a native speaker of Turkish, I can say that writing those suffixes separated from the stem would be like writing the English word unsuccessful as un success ful. It would be impractical to separate those suffixes by spaces, perhaps because of the number of suffixes in Turkish, as you have pointed out. Also note that the vowel of a suffix is modified in most Turkish words. For example, the suffix -leş in your example would be -laş for words ending in, for example, the letter ı. Example: Başarı-sız-laş (-mak). – talkanat Jan 17 '21 at 20:02
  • Also note that "muvaffakiyet" is not a "word" either but made by adding the Arabic prefix "mu" and suffix "iyet" to the Arabic word "vafk/vefk". Ironically all those additions make the word longer but the final meaning is the same as the root word (muvaffakiyet = vafk). – Selcuk Jan 18 '21 at 05:43
  • @Selcuk, there are no prefixes in Turkish language. And any loan word from another language that is used in Turkish is considered a word as a whole with its suffix/prefixes. That is why I denoted muvaffakiyet as the "stem" here. But considering its Arabic root, you are of course correct. – talkanat Jan 18 '21 at 17:01
  • To be clear I wasn't implying that your answer is incomplete and/or incorrect but pointing out a fact that I find interesting. It is certainly a root word as long as Turkish language is concerned. – Selcuk Jan 18 '21 at 23:09
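
The -leş / -laş alternation mentioned in the comments above can be sketched as a simple rule: pick the suffix vowel according to the last vowel of the stem. This is a deliberately simplified illustration, assuming lowercase input and a plain front/back split of the Turkish vowels; the function names are made up, and real Turkish has exceptions (for instance in some loanwords) that this ignores.

```python
# Simplified front/back classification of Turkish vowels (an assumption
# of this sketch; exceptions are not handled).
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")


def last_vowel(stem: str) -> str:
    """Return the last vowel of a lowercase stem."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS or ch in FRONT_VOWELS:
            return ch
    raise ValueError(f"no vowel found in {stem!r}")


def add_les(stem: str) -> str:
    """Attach -leş or -laş depending on the last vowel of the stem."""
    suffix = "laş" if last_vowel(stem) in BACK_VOWELS else "leş"
    return stem + suffix


for stem in ("muvaffakiyetsiz", "başarısız"):
    print(add_les(stem))
```

Because the suffix changes shape to match the stem it attaches to, it behaves like part of the word and not like an independent item separated by a space.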
4

In German these compound terms are called "Komposita" (the plural of "Kompositum"). As for your example "Kraftfahrzeug-Haftpflichtversicherung", this could just as well be expressed as "Haftpflichtversicherung für ein Kraftfahrzeug". Compound terms are often formed in order to merge such lengthier descriptions into a single term (without inventing a new one). The term "Kraftfahrzeug-Haftpflichtversicherung" can be split up into 5 words:
"Kraft" (power), "Fahrzeug" (vehicle), "Haft(ung)" (liability), "Pflicht" (duty), "Versicherung" (insurance).
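
As a rough illustration, the five parts can be glued back together mechanically. The helper name `compound` is made up, and the sketch assumes plain concatenation with only the first part capitalized; it does not handle the linking elements that some German compounds insert between parts.

```python
# The five parts listed above ("Haft" stands in for "Haft(ung)").
PARTS = ["Kraft", "Fahrzeug", "Haft", "Pflicht", "Versicherung"]


def compound(parts: list[str]) -> str:
    """Join noun parts into one compound, keeping only the first capital."""
    head, *rest = parts
    return head + "".join(part.lower() for part in rest)


print(compound(PARTS))
```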

The term "Donaudampfschifffahrtsgesellschaft" is probably still the most prominent, due to the extraordinary triple f (which can only occur in such a compound term, and nowhere else).

  • 13
    If you check the discussion here carefully, you'll see that something being splittable into multiple words does not necessarily make the compound "not a word". English has many such examples, like butterfly or classmate, which by your logic should be "obviously" not considered words. – Norrius Jan 16 '21 at 20:17
  • 2
    The English term for a Kompositum is ‘compound (word)’. – Janus Bahs Jacquet Jan 16 '21 at 22:16
  • 2
    Technically, Kraftfahrzeug-Haftpflichtversicherung can be split up into 6 words, because the word Fahrzeug can be split up into Fahr and Zeug ;) – trainman261 Jan 18 '21 at 00:38
  • Donaudampfschifffahrtsgesellschaft was already a very prominent example (most of the time used with an additional Kapitän at the end) long before the orthography reform of 1996 introduced the third f. – Bill Tür stands with Ukraine Jan 18 '21 at 09:23
  • @Norrius I would say it's a compound word if it can be split and the result would be interpreted the same. "Butter" "fly" would be a variety of fly, not a beautiful big-winged insect. Thus it's not a compound. "Class" "mate", however, retains the same meaning, I would call it a compound. And a carrot isn't a decaying motor vehicle, it's not a compound word. – Loren Pechtel Jan 19 '21 at 03:38
  • My kids inform me that it's "Donaukanaldampfschifffahrtsgesellschaftkapitänswitwenrente" (Donau canal steamship company captain's widow's pension). And indeed we now have a Rhein-Main-Donaukanal, and you could use that if you want. – RedSonja Oct 11 '22 at 13:03
0

The suffix sequence -leş-tir occurs twice in muvaffakiyetsizleştiricileştir-, so the word is nonsense. If you want, you can keep adding suffixes like this to any word and end up with a gibberish string that merely sounds like Turkish.