
First of all, I apologize: my English skills are far below the complexity of the question I need to ask. I am not a specialist, and my question is not about a single language. I would like to identify the subset of "words" — or better, of concepts — that, once learned or labeled for their semantic or logical relevance, are "enough" to build a meaningful description of every other word in a language. If we call these words or concepts primitives (archetypes), how many of them do we need in a basic dictionary? They are what allows a language to become self-explanatory. I found a question about circularly defined words, but that question seems to rest on a statistical evaluation of the words in a dictionary. I would like to do something similar but focus on the semantic aspect — on concepts — and then extract the words. This could make it possible to avoid the "minimum feedback arc set" problem that arises when trying to reverse-engineer a dictionary. Thank you in advance.
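As an illustration of the graph view the question alludes to (this sketch is not from the original post): treat each headword as a node with edges to the words used in its definition; circular definitions are then cycles in this digraph, and "primitives" are a set of words that, once accepted as undefined, let every remaining word be defined without circularity. The toy dictionary below and the greedy most-used-word rule are assumptions made purely for illustration.

```python
# Toy dictionary: each word maps to the words used in its definition.
# Illustrative data only -- real dictionaries are vastly larger.
toy_dictionary = {
    "big":   ["not", "small"],
    "small": ["not", "big"],
    "not":   ["not"],            # circular: defined via itself
    "cold":  ["not", "hot"],
    "hot":   ["not", "cold"],
    "ice":   ["cold", "water"],
    "water": ["water"],          # circular
}

def find_primitives(dictionary):
    """Greedily pick 'primitive' words until every other word can be
    defined acyclically in terms of primitives and already-defined words."""
    primitives, defined = set(), set()
    remaining = set(dictionary)
    while remaining:
        # Peel off words whose definitions use only grounded words.
        progress = True
        while progress:
            progress = False
            for w in list(remaining):
                if all(d in defined or d in primitives for d in dictionary[w]):
                    defined.add(w)
                    remaining.remove(w)
                    progress = True
        if remaining:
            # Every remaining word sits on a definitional cycle:
            # promote the most-used one to a primitive (greedy heuristic;
            # ties broken alphabetically for determinism).
            usage = {w: sum(w in dictionary[v] for v in remaining)
                     for w in remaining}
            chosen = max(sorted(usage), key=usage.get)
            primitives.add(chosen)
            remaining.remove(chosen)
    return primitives

print(sorted(find_primitives(toy_dictionary)))
```

This greedy peeling is only a heuristic: finding the truly minimal primitive set is equivalent to a minimum feedback vertex set problem, which is NP-hard in general — which is exactly why the question asks whether a semantic (rather than purely graph-theoretic) criterion could sidestep it.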

Lorenzo
  • There is no such measure, because it is a judgement of probability that can't be tested. Essentially it is a matter of opinion that every dictionary editor has to decide on individually. You might survey successful dictionaries of various languages and see what their average word count is, if that's any help. – jlawler Mar 05 '17 at 14:22
  • Thank you for your answer. I am trying to figure this out because I think there are "primitives" in every language. In the beginning you need a direct association with the object, the action, or the relationship (a boolean operator, for example), but at some point you can go abstract and use these words to explain everything else. I think you need a very small dictionary; then you can use sentences to express more complex meanings, and with knowledge you can summarize those sentences in a single, more complex word. Does this make any sense? – Lorenzo Mar 05 '17 at 18:03
  • You are not the first to believe that there are "primitives" in every language. But opinions vary about whether every language has its own set, like phonemes, or whether there is some Ur-set that underlies all languages; the latter is much harder to demonstrate, of course. Perhaps a book like Frawley's Linguistic Semantics or Foley's Anthropological Linguistics — both of which go over all the semantic categories known to be used by human languages — might help you. – jlawler Mar 05 '17 at 18:11
  • Thank you very much! I will search for and read these books (I hope I can understand them :D ), for sure. Probably every language has its own subset of primitives, but many of them are common. All the primitives that are not relative (< is always <; it can't be = or >) must be common; I would call them absolute primitives. Then there are primitives that can be relative, such as hard, cold, hot, small, big, and their relevance differs across languages. I will go on with my search and will try to analyze a few dictionaries. Thank you again! – Lorenzo Mar 05 '17 at 18:25
  • The Natural Semantic Metalanguage approach to semantics developed by Anna Wierzbicka holds that there is a set of semantic primitives, found in every language, which are sufficient to define any word but are not themselves definable. Further information, including on methodology, here. – Gaston Ümlaut Mar 06 '17 at 04:33
  • Another related question: http://linguistics.stackexchange.com/questions/20960/what-is-the-minimal-set-of-words-that-make-a-language-complete – Sir Cornflakes Mar 06 '17 at 09:45
  • See https://en.wikipedia.org/wiki/Basic_English. – Greg Lee Mar 06 '17 at 12:40
  • @Lorenzo You might be interested in the "Historical Thesaurus" section here: http://www.oed.com/public/browse/browsing – SAH Aug 04 '17 at 17:49