
First of all, I apologize: my English skills are far below the complexity of the question I need to ask. I am not a specialist, and my question is not about a single language. I would like to identify the subset of "words" — or better, of concepts — that, once learned or labeled for their semantic or logical relevance, are "enough" to build a meaningful description of every other word in a language. If we call these words or concepts primitives (archetypes), how many of them do we need in a basic dictionary? They are what allows a language to become self-explanatory. I found a question about circularly defined words, but that question seems to rest on a statistical evaluation of the words in a dictionary. I would like to do something similar but focus on the semantic aspect — on concepts — and then extract the words. This could make it possible to avoid the "minimum feedback arc set" problem that arises when trying to reverse-engineer a dictionary. Thank you in advance.
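As an illustration of the graph view the question alludes to (this sketch is not from the original post): treat each headword as a node with edges to the words used in its definition; circular definitions are then cycles in this digraph, and "primitives" are a set of words that, once accepted as undefined, let every remaining word be defined without circularity. The toy dictionary below and the greedy most-used-word rule are assumptions made purely for illustration.

```python
# Toy dictionary: each word maps to the words used in its definition.
# Illustrative data only -- real dictionaries are vastly larger.
toy_dictionary = {
    "big":   ["not", "small"],
    "small": ["not", "big"],
    "not":   ["not"],            # circular: defined via itself
    "cold":  ["not", "hot"],
    "hot":   ["not", "cold"],
    "ice":   ["cold", "water"],
    "water": ["water"],          # circular
}

def find_primitives(dictionary):
    """Greedily pick 'primitive' words until every other word can be
    defined acyclically in terms of primitives and already-defined words."""
    primitives, defined = set(), set()
    remaining = set(dictionary)
    while remaining:
        # Peel off words whose definitions use only grounded words.
        progress = True
        while progress:
            progress = False
            for w in list(remaining):
                if all(d in defined or d in primitives for d in dictionary[w]):
                    defined.add(w)
                    remaining.remove(w)
                    progress = True
        if remaining:
            # Every remaining word sits on a definitional cycle:
            # promote the most-used one to a primitive (greedy heuristic;
            # ties broken alphabetically for determinism).
            usage = {w: sum(w in dictionary[v] for v in remaining)
                     for w in remaining}
            chosen = max(sorted(usage), key=usage.get)
            primitives.add(chosen)
            remaining.remove(chosen)
    return primitives

print(sorted(find_primitives(toy_dictionary)))
```

This greedy peeling is only a heuristic: finding the truly minimal primitive set is equivalent to a minimum feedback vertex set problem, which is NP-hard in general — which is exactly why the question asks whether a semantic (rather than purely graph-theoretic) criterion could sidestep it.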

Lorenzo
  • There is no such measure, because it is a judgement of probability that can't be tested. Essentially it is a matter of opinion that every dictionary editor has to decide on individually. You might survey successful dictionaries of various languages and see what their average word count is, if that's any help. – jlawler Mar 05 '17 at 14:22
  • Thank you for your answer. I am trying to figure this out because I think there are "primitives" in every language. In the beginning you need a direct association with the object, the action, or the relationship (a boolean operator, for example), but at some point you can go abstract and use these words to explain everything else. I think you need a very small dictionary; then you can use sentences to express more complex meanings, and with knowledge you can summarize those sentences in a single, more complex word. Does this make any sense? – Lorenzo Mar 05 '17 at 18:03
  • You are not the first to believe that there are "primitives" in every language. But opinions vary about whether every language has its own set, like phonemes, or whether there is some Ur-set that underlies all languages; the latter is much harder to demonstrate, of course. Perhaps a book like Frawley's Linguistic Semantics or Foley's Anthropological Linguistics — both of which go over all the semantic categories known to be used by human languages — might help you. – jlawler Mar 05 '17 at 18:11
  • Thank you very much! I will search for and read these books (I hope I can understand them :D ), for sure. Probably every language has its own subset of primitives, but many of them are common. All the primitives that are not relative (< is always <; it can't be = or >) must be common; I would call them absolute primitives. Then there are primitives that can be relative, such as hard, cold, hot, small, big, and their relevance differs across languages. I will go on with my search and will try to analyze a few dictionaries. Thank you again! – Lorenzo Mar 05 '17 at 18:25
  • The Natural Semantic Metalanguage approach to semantics developed by Anna Wierzbicka holds that there is a set of semantic primitives, found in every language, which are sufficient to define any word but are not themselves definable. Further information, including on methodology, here. – Gaston Ümlaut Mar 06 '17 at 04:33
  • Another related question: http://linguistics.stackexchange.com/questions/20960/what-is-the-minimal-set-of-words-that-make-a-language-complete – Sir Cornflakes Mar 06 '17 at 09:45
  • See https://en.wikipedia.org/wiki/Basic_English. – Greg Lee Mar 06 '17 at 12:40
  • @Lorenzo You might be interested in the "Historical Thesaurus" section here: http://www.oed.com/public/browse/browsing – SAH Aug 04 '17 at 17:49