
I do dependency grammar (DG), and my personal view is that dependency gets the job done far more efficiently than constituency. The average constituency parse (= phrase structure parse) contains almost twice as many nodes and edges as the average dependency parse, and automated dependency parsing is significantly faster than automated constituency parsing. The point is illustrated with the following trees, taken from the article on DG in Wikipedia:

[Figure: a dependency tree and a constituency tree for the same sentence, from the Wikipedia article on DG]

The dependency parse contains 7 nodes and 6 edges, whereas the constituency parse contains 13 nodes and 12 edges. Compared to constituency, the parsimony of the dependency parse is undeniable, and this parsimony extends through the entire theoretical apparatus. If dependency can do everything that constituency can do, Occam's Razor requires that constituency be abandoned in favor of dependency.
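To make the counts concrete, here is a minimal sketch in Python that represents each parse as a parent-to-child edge list and counts nodes and edges. The dependency edges assume the sentence in the figure is "We are trying to understand the difference", and the constituency bracketing is just one plausible structure consistent with the counts above; both are illustrative assumptions, not the only possible analyses.

    # Dependency parse: every word is a node; one edge per head-dependent pair.
    dep_edges = [
        ("trying", "We"), ("trying", "are"), ("trying", "understand"),
        ("understand", "to"), ("understand", "difference"), ("difference", "the"),
    ]

    # One plausible constituency bracketing with the counts cited above:
    # [S [NP We] [VP are [VP trying [VP to understand [NP the difference]]]]]
    con_edges = [
        ("S", "NP1"), ("S", "VP1"), ("NP1", "We"),
        ("VP1", "are"), ("VP1", "VP2"), ("VP2", "trying"), ("VP2", "VP3"),
        ("VP3", "to"), ("VP3", "understand"), ("VP3", "NP2"),
        ("NP2", "the"), ("NP2", "difference"),
    ]

    def counts(edges):
        nodes = {n for edge in edges for n in edge}
        return len(nodes), len(edges)

    print(counts(dep_edges))  # (7, 6)
    print(counts(con_edges))  # (13, 12)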

The dependency vs. constituency distinction is addressed in the following two Wikipedia articles: https://en.wikipedia.org/wiki/Dependency_grammar and https://en.wikipedia.org/wiki/Phrase_structure_grammar.

I actually believe that constituency is necessary, but only for the analysis of coordinate structures. Otherwise, constituency makes the study of syntax more complex and complicated than it should be.

My question in this area is therefore as follows: What can one do with constituency that one cannot do with dependency? Why is constituency needed?

Tim Osborne
  • Could you link the Wikipedia article? Also, could you add a definition of dependency and constituency parsing? Your example is clear enough, but personally I couldn't define the difference between the approaches just based on the example. – robert May 28 '14 at 17:35
  • Pick one DG variety, please; they have very different presuppositions and categories, when applied to syntax. – jlawler May 28 '14 at 17:54
  • @robert, I have added links to two Wikipedia articles that address the dependency vs. constituency distinction. – Tim Osborne May 28 '14 at 17:55
  • Rather badly, however; constituency has little to do with phrase structure grammars. It's a phenomenon of transformations. – jlawler May 28 '14 at 18:05
  • @jlawler, the DG I have in mind is my own and that of my coauthors. This DG has been developed in about 20 peer-reviewed articles and book chapters. It is monostratal in syntax, construction-based, and it assumes trees (as opposed to networks). Some of these articles are available for free, e.g. http://english6.net/t/toward-a-practical-dependency-grammar-theory-of-discontinuities-w11971.html and http://www.coli.uni-saarland.de/~tania/CMGD/Beyond%20the%20constitient.pdf. I am also responsible for the Wikipedia article on DG: https://en.wikipedia.org/wiki/Dependency_grammar. – Tim Osborne May 28 '14 at 18:35
  • @jlawler, I agree that from a historical point of view, the notions of "constituency" and "phrase structure" have distinct origins. But the two terms are now viewed by many as basically synonymous. – Tim Osborne May 28 '14 at 18:38
  • Well, too bad for them. I'm not concerned with what many linguists believe. Construction-based is good; monostratal is not. Trees are OK, if they are helpful; otherwise, they're just bureaucracy. No doubt there are many other presuppositions as well. – jlawler May 28 '14 at 19:25
  • @Jlawler, again, if we transit to the discussion of concrete examples, we can overcome our varying presuppositions. If you produce a concrete sentence and some discussion of it that illustrate why constituency is needed, we can have an exchange that is profitable for us and this forum. Note that the question contains two concrete parses of an actual sentence. – Tim Osborne May 28 '14 at 19:29
  • And, as I said before, the idea that one kind of grammar can do everything is ridiculous. Language (and therefore grammar) is part of a living system, and living systems almost always choose every possible way of doing anything, producing defaults and overlaps and redundancies galore. That's normal. What's not normal is keeping syntax, semantics, pragmatics, and phonology in sealed compartments with only interfaces between them. Nothing living is built like that; that's an artifact of computer science. – jlawler May 28 '14 at 19:31
  • Since I don't know what presuppositions are built into your system, I don't know what kind of thing you would accept as valid. I am not doing Emic syntax here -- I don't care what the big theory says; they never work anyway -- I'm only interested in consistent and useful description. Just the Etics, ma'am, just the Etics. – jlawler May 28 '14 at 19:32
  • @Jlawler, we are being warned to limit our comments. Let's wait and see. Perhaps someone else will produce an answer that will allow for more exchange. – Tim Osborne May 28 '14 at 19:33
  • Yes, this is not the forum for a discussion. The best we can do is write papers at each other, and nobody's paying me to do that any more. – jlawler May 28 '14 at 19:33
  • However, if you want an example, try The Cliff's on Equi and Raising. There are mentions of grammatical relations -- since these are complement clauses, they're either subject or object of the matrix predicate -- but otherwise it's pretty much constituency. Generative semantics, in fact. Diagram on p.5. – jlawler May 28 '14 at 19:49
  • @jlawler, please consider putting your examples in a coherent answer with some discussion. Doing so can be worth the time. I have notified numerous colleagues about our exchange. Your answer will be read with great interest. – Tim Osborne May 28 '14 at 20:37
  • It's a matter of notational variation. Why care which you use? It's like using Polish prefix notation instead of infix notation. – Greg Lee Feb 21 '15 at 13:44
  • @Greg Lee, you are not alone in that claim. I, though, disagree strongly. You can do things with constituency that you can't do with dependency. The pertinent question for me, though, is whether the things you can do with constituency that you can't do with dependency are worth doing. I don't think they are. – Tim Osborne Feb 21 '15 at 14:00
  • @TimOsborne, why is that a pertinent question? It's still just a notational issue. Anyhow, in another comment I gave a phonological example of something you need constituents available for -- output constraints on phonological rules. – Greg Lee Feb 21 '15 at 14:20
  • @Greg Lee, I'd enjoy going over this distinction with you in a meaningful way. Again, here's my email address: tjo3ya@yahoo.com. Trying to convey a message in such little space here is not going to work. But just one parting shot, if constituency and dependency are essentially notational variants of the same thing, then one should be able to do everything with the one that one can do with the other, right? I'd enjoy learning how c-command can be understood in dependency-based structures. – Tim Osborne Feb 22 '15 at 05:11
  • @TimOsborne, No, that's not right. In the analogy I made, Polish prefix notation can be parenthesis free. You choose notations for convenience or familiarity. (My email is not functional right now.) – Greg Lee Feb 22 '15 at 14:14
  • Hi Timothy, you were mentioned in this Reddit discussion. I would be very interested in your opinion on the matter, if you don't mind. – Géry Ogam Oct 20 '20 at 23:41

6 Answers


The long answer would be very long, but the short one is short enough: Your question 'begs the question'; it is simply not true that 'dependency gets the job done more easily and economically'. What's more, it does not get the job done at all (if by the job we mean the job the best constituency-based grammars already do, and with great precision, as a matter of course).

A (syntactic) 'dependency' is just a syntagmatic relation between two minimal signs (under DG assumptions), and, of course, to that extent DG's basic 'tool' is extremely flexible (so are part-whole relations, though), which is why it has usually attracted the attention of computational linguists rather than 'pure' linguists. But most syntagmatic relations are irrelevant, and why they are irrelevant must be explained. Irrelevant dependencies should not be represented in DG diagrammes if those were really minimal and as efficient as you claim (yet they are represented, and the diagrammes are not minimal at all). Furthermore, the dependencies considered relevant to DG must still be properly defined, because, as far as I know, even 'subject of', 'object of', 'adjunct of' (and, correspondingly, 'predicate of', 'argument of', 'Agent of', 'Patient of', etc.) have not yet been properly defined in DG.

To illustrate with just one example: in My wife has told our eldest daughter to clean the kitchen after today's party (a slightly more elaborate sentence than the one cited in the Wikipedia article on 'Catena'), wife is not the 'subject', nor daughter the 'object', nor to the 'complement' of has (or told, for that matter). Needless to say, neither is kitchen the 'object' of clean, nor party the 'complement' of after, nor after the 'adjunct' of clean, etc., etc.

If DG wants to call such dependencies 'subject', 'object', etc. it must develop a completely new theory of syntactic functions, and that, of course, will be at a heavy price. I suspect that even if that were possible at all, which I doubt, the resulting theory would certainly not be simpler, nor more efficient, than the extant constituency-based one, on the contrary.

Correspondingly, at the semantic level, wife, daughter, to, kitchen, party, etc. are not 'arguments' of their respective '(maximally unsaturated) predicates' has/told, clean and after, either, and calling them 'arguments', and assigning 'semantic roles' to them will require massive adjustments in semantic theory and its associated ontology. Just think of the fact that arguments (in the sense we are using the term now) must be referential, i.e., their names must denote (first-order) entities, whereas, of course, wife, daughter, kitchen, party, etc. do not denote individuals at all.

As far as I can see, that, by itself, is an insurmountable difficulty, but there is more: DG will then need 'correspondence rules' to 'link' syntactic to semantic terms, and, to my knowledge, those have not been developed either.

Note that such objections emerge even at the most elementary level of metatheoretic analysis, in the simplest grammatical sentences, but there is rather more than that to account for in a natural language, isn't there? There is, at least in many languages, an extremely subtle 'word order' (just think of the ordering restrictions among the more than forty different classes of adverbs that Cinque has identified), and, of course, there is 'scope', 'displacement', 'discontinuity', 'island (accessibility) constraints', 'minimality effects', 'superiority effects', 'reconstruction effects', and there is 'binding', and 'control', etc., etc., etc.

DGs seem to 'work' (of course, only as elementary parsing strategies, ignoring all the above-mentioned difficulties) because they are typically used to parse only the simplest well-formed sentences, as happens in CL work, i.e., they are 'restricted prototypes', trivial toys, at bottom, but a respectable linguistic theory must also explain why ill-formed sentences (or interpretations thereof) are ill-formed, and to do that you must do rather more than draw or not draw arcs between signs more or less at will.

You do not account for wh-movement by just drawing an arc between What and buy in What did he say he wanted to buy?, nor explain the ungrammaticality of '*Who did you say that wanted the job?' by not drawing any arc between who and wanted, nor account for the twofold dependency between the subject He and the two verbs has and playing in He has been playing the guitar by drawing a couple of arcs between He and -s and play, nor predict the binding phenomena by drawing another arc between He and himself in He promised Tom to do it himself or omitting it between He and him in He promised him to do it himself, etc.
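(For readers who want to see concretely what 'drawing an arc' amounts to in practice, here is how one can inspect the arcs an off-the-shelf dependency parser assigns to the first example. This is a sketch assuming spaCy and its small English model are installed; the labels it prints depend entirely on the trained model and carry no explanatory weight, which is precisely the point at issue.)

    import spacy

    # Assumes: pip install spacy && python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("What did he say he wanted to buy?")

    # Each token points to a head via one labelled arc (the root points to itself).
    for tok in doc:
        print(f"{tok.text:>8}  <-{tok.dep_:^12}-  {tok.head.text}")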

I could go on like this more or less indefinitely, in virtually any aspect of a proper syntactic-semantic theory, but will not. The answer to your question is very simple: DGs cannot oust state-of-the-art constituency-based grammars because they are neither empirically comparable to them in any respect, nor conceptually or representationally any simpler than they are (once all the auxiliary assumptions needed to reconcile DG analyses with the empirical facts are properly spelled out). So far, they have not been, and with good reason: DG is still resting on theories of syntactic and semantic functions that are incompatible with it.

Of course, going any further is out of the question: even 'translating' the principles of current CG-based grammars into a framework that denies the existence of phrasal dependents (in spite of the fact that there is unquestionable evidence that natural language is structure-dependent) may well be literally impossible, as partly explained above. As to DG's alternatively developing equivalent principles of its own to properly handle the empirical effects of constituency, allowed and disallowed displacement, correct and incorrect scope, minimality, superiority, island constraints, correct and incorrect binding, control, ellipsis, deletion and empty-category effects, etc., well, I will not say it is 'impossible', but none of that remotely exists yet, and everything suggests that if a proper theory of 'all that' ever comes to be developed from a strict DG point of view, it will be rather more stipulative and complex, and rather less efficient, than state-of-the-art constituency-based grammars.

You are welcome to explore and try to promote any new theory you fancy, but it is pretentious of DG fans to design little toy grammars and ignore the enormous body of knowledge that CGs have managed to offer us after sixty years of colossal intellectual work by thousands of the best linguists the world has ever produced.

In closing, I would like to expand my critical remarks by elaborating on a simple, but powerful (I hope) would-be conceptual argument against Dependency Grammar as programmatically presented in its foundational manifestos. It goes, more or less, as follows:

The branches of a traditional constituency-based phrase structure tree represent syntactic relations between the mother node and its daughter nodes. As a consequence, if the sentence S (say: John sent Mary flowers) is (just for the sake of argument!) analysed as a 'flat' structure with four branches J + s + M + f, there must be four relevant syntactic relations between the daughter nodes and their mother node. This is, indeed, the case: the relevant syntactic relations are Subject of (S) = John, Head of (S) = sent (let's not bring Infl or T into the picture here, OK?), Indirect Object of (S) = Mary, and Direct Object of (S) = flowers.

Alternatively, if we opted for a binary-branching analysis, as in current Merge-based theories, the sentence S would have just two branches, which, ignoring the actual labels now used, I will simply call Subject of (S) = John and Predicate of (S) = sent Mary flowers.

However, if we had not specified that (under that analysis!) the mother node must be the pivot of such syntactic relations, our initial flat tree would also represent irrelevant syntagmatic ‘connections’, mediated by the mother node S, between J and M, J and f, or M and f to which no known relevant syntactic function can be assigned.

Since CG rests on part-whole relations, that constraint follows from the CG approach as an inherent property, and no conceptual problem arises even if we opt for the flat analysis; of course, no problem arises at all if we opt for a binary-branching one (which can be turned into an argument for binary-branching analyses, by the way).

Now, suppose we say with Dependency Grammars that the phrasal ‘mother’ node is irrelevant (i.e., that part-whole relations are irrelevant to syntax) and that the only syntactic relations that count are ‘part-part’ ones between the elements of John sent Mary flowers, i.e., J, s, M and f. That hypothesis predicts the existence of relevant syntagmatic relations between 1) J and s (or s and J), 2) M and s (or s and M), and 3) f and s (or s and f), and let’s grant that, in this simple case, it is possible to label them, respectively, ‘subject of’, ‘indirect object of’, and ‘direct object of’ sent. [The opposite strategy, to say that 'verb of'(J) = sent & 'verb of'(M) = sent & 'verb of'(f) = sent, would leave us in the dark as to the functions of J, M and f, and, of course, would be too uninformative to be worth considering.]

Such a flat analysis, however, also predicts the existence of additional syntagmatic relations between 4) J and M (or M and J), 5) J and f (or f and J), and 6) M and f (or f and M) for which no known syntactic function label is available (a problem that would not have arisen under a binary-branching analysis of S, recall).
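The arithmetic behind relations 1) through 6) can be spelled out: four sister terminals yield C(4,2) = 6 part-part pairs, of which only the three containing the verb receive a known function label. A trivial Python sketch, purely illustrative:

    from itertools import combinations

    words = ["John", "sent", "Mary", "flowers"]
    pairs = list(combinations(words, 2))             # all part-part pairs: 6
    labelable = [p for p in pairs if "sent" in p]    # 1), 2), 3): involve the verb
    residue = [p for p in pairs if "sent" not in p]  # 4), 5), 6): no known label

    print(len(pairs), len(labelable), len(residue))  # 6 3 3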

The way to account for the irrelevance of the syntagmatic relations 4), 5) and 6) is to stipulate that only relations in which the verb is involved count (for clause-level syntax). In other words, there must be a special designated term which somehow acts as the ‘pivot’, ‘head’ or ‘root’ of the whole structure, and that, under DG assumptions, is, in this case, the verb sent.

However, for as long as ‘the whole structure’ is not recognized as a relevant syntactic object at all, there is no way to even formulate that property of sent: what is sent the 'root' of in the proposition Head/Root of (?) = sent? What is ? in such an equation?

Of course, it must be the ‘phrase’ S, in this case, ergo phrasal nodes must be syntactically relevant categories or it would be impossible to define sent as the ‘root’ or ‘head of’ anything. Q.E.D.

If I am not terribly mistaken, this is a simple, but valid, conceptual argument against all DG theories to the extent that they remain faithful to their foundational manifestos. If they withdraw their foundational claims and admit that phrases are syntactically relevant objects and terms of bona fide syntactic dependencies, of course, this argument no longer applies; but, if they do, DGs can hardly be as perspicuous and efficient as CGs. Note that whereas in CG approaches, say X-bar or Merge-based syntax, the theory automatically defines a transparent correspondence between phrases and their heads (if we know the label of the head, we know the label of the phrasal node, and vice versa), in DG such correspondences must be stipulated (as happened in early PS rules!).

This seems to me a simple, but cogent, reason to challenge the alleged superiority of DGs over state-of-the-art CGs, even if we ignore the theoretical vacuousness of DGs in chapters as important as those mentioned above and compare them to CGs only as mere diagramming tools.

[Just in case you think I have an axe to grind in this matter, let me tell you that I am by no means an orthodox Chomskyan linguist; I'm just a self-taught linguist and an intellectually open scholar who has bothered to take a good look at many other people's gardens and knows very well what it has taken LFG, GPSG-HPSG, CG, OT, FG, Word Grammar, Cognitive Grammar, FDG..., you name it, to barely mimic in their own terms just the most flagrantly necessary components of what Chomskyan P&P Theory had already achieved thirty years ago. What is a pity is that the subsequent 'minimalist programme' has largely repudiated much of that work and taken refuge in a ridiculously restricted concept of Human Language as, basically, free recursion, but that is a different matter.]

  • Well, I've read your entire answer. Of course I disagree. It appears as though you took the great Chomskyan leap of faith long ago, and there's no looking back. We're not going to get anywhere with an exchange on this issue. – Tim Osborne Feb 22 '15 at 11:54
  • You make a good point. Drawing arcs between words more or less at will is not science. There is a lot of nonsense in the "research" on DG and it is naive, to put it mildly, to think that DG could replace constituency-based theories. Dependencies are useful in the description of language, but that doesn't mean that one can dispense with constituency. – Atamiri Feb 22 '15 at 12:41
  • @Atamiri, to repeat what is stated in the question, I do not believe that constituency should be dispensed with entirely. I think it is necessary for coordination. Note that the answer degrades LFG, your favorite flavor of constituency. – Tim Osborne Feb 22 '15 at 13:38
  • @TimOsborne A theory of grammar has to explain why certain sentences are well-formed while others are ill-formed. Everybody knows what a dependency tree should look like. But are you able to formulate rules within DG that describe well-formed sentences while ruling out ungrammatical ones? Examples would be nice. By rules I mean precise procedures or constraints that leave no room for diverging interpretations. – Atamiri Feb 22 '15 at 14:20
  • @Atamiri, which of my DG papers would you like to read? Again, contact me via email (if you're bold enough to identify yourself). I'll very happily share our articles (mine and my coauthors') with you and explain them all. Note that they are all in good, peer-reviewed linguistics journals. – Tim Osborne Feb 22 '15 at 14:28
  • @TimOsborne I'd like to see concrete rules, not just trees, that's the point. – Atamiri Feb 22 '15 at 14:38
  • @Atamiri, OK, give me an example or two of how these rules should look. What sort of symbols do you want me to use? What phenomenon do you want me to address using these rules? Note that you can cite the rules in Bresnan's book (2001). Perhaps I can render them in a manner that is dependency-based. – Tim Osborne Feb 22 '15 at 14:43
  • @Atamiri, better yet, send me an email. Note that I won't share your address with anyone. Eva knows me; she can vouch for me that I'm not dangerous. I'm not going to sell your address to anyone. – Tim Osborne Feb 22 '15 at 14:45
  • Please re-read the last paragraph of my answer, do not misrepresent what I said, and face my criticism if you can. I think that, if you want to know where your DG is, theoretically speaking, you can start by asking yourself this 'simple' question: can my DG specify under what conditions an X-dependency can hold between A and B (for X = rection, selection, agreement, binding, control,... topicalization, wh-movement, raising,... subject, object, adjunct, etc.)? If it cannot, your DG may be another parsing and diagramming tool, but not a theory of syntax, semantics or, even less, Language. –  Feb 23 '15 at 14:07
  • @Sibutlasi: Here's the key part from your last paragraph: "...LFG, GPSG-HPSG, CG, OT, FG, Word Grammar, Cognitive Grammar..., you name it, to barely mimick in their own terms just the most flagrantly necessary components of what Chomskyan P&P Theory had already achieved thirty years ago". That paragraph degrades LFG, HPSG, CG, OT, FG, Word Grammar, Cognitive Grammar. Choice expressions such as "barely mimicking" are degrading. But you forgot to mention TAG, CxG, RRG, RG, MTT, FGD and certainly others as well. I think you need to do more reading. – Tim Osborne Feb 24 '15 at 06:46
  • @Sibutlasi, your answer misrepresents DG quit seriously. One example, you write: "a framework that denies the existence of phrasal dependents". DGs acknowledge phrases. A complete subtree is (consistenting of two or more words) is a phrase. I encourage you to contact me via email: tjo3ya@yahoo.com. I will hook you up with some literature that will help correct your misconceptions about DG. – Tim Osborne Feb 24 '15 at 06:54
  • @Tim Osborne. Ars longa, vita brevis. I certainly cannot pretend to be up-to-date in what is published in ALL 'schools' of linguistic thought. Nobody can, probably, but most of what gets published is not really worth reading, not once you have a sufficiently ecumenical cross-theoretical view of the field. Nevertheless, you may add TAG, RRG and RG to the list of 'incomplete' theories of language if measured against the mature versions of P&PT. If DG now acknowledges the relevance of phrases, SOME of my criticism above may no longer apply, but then your own challenge to CGs seems unfounded. –  Feb 25 '15 at 22:27
  • @Sibutlasi, I've checked out your webpage. You are a knowledgeable, accomplished linguist. But you're misinterpreting and misrepresenting DG. If you want to go over it with me in a way that would be more meaningful, send me an email (tjo3ya@yahoo.com). I will be respectful. – Tim Osborne Feb 26 '15 at 01:03
  • Thank you, thank you, Sibutlasi, for such a thoughtful and insightful answer. I have been trying to make sense of why DG is so popular (or should I say, fashionable) and your response covers so many of the observations and questions that I've had in trying to make sense of this. I will be using constituency grammar for the analysis I am conducting - thank you so much for giving me many more sound reasons for why I should do that rather than DG - and for helping make sense of why DG seems to have become the 'hipster grammar' du jour. – Collega Mar 30 '21 at 23:38
  • "You are welcome to explore and try to promote any new theory you fancy" A DG theory that adresses the mentioned points has been developed since the 1960s (so approximatively since the beginning of Chomskyan CG), it is called the Meaning-Text Theory. "A theory of grammar has to explain while certain sentences are well-formed while others are ill-formed." This is questionable, because many utterances in one language are neither completely well-formed or completely ill-formed, this has to do with diachronic and diatopic variations. – Starckman Jun 22 '21 at 13:40

As a proponent of construction grammar, I am perhaps the wrong person to answer this. But I can see at least two non-computational advantages of constituency parsing:

  1. It lets you directly encode generalizations about the individual components that make up a clause or sentence. Your constituency tree does not take advantage of it (perhaps being influenced by dependency approaches).

  2. It does not require a commitment as to the head of every phrase. So if you change your mind about, say, verbs heading the VP, you can do so without changing the parse.

Of course, you can get the best of both worlds with something like Categorial Grammar or Unification Categorial Grammar.

But I could be wrong both about my answer and about the point of your question.

Dominik Lukes
  • Thanks for your answer. Concerning construction grammar (CxG), I too am a fan. My colleagues and I have devoted some effort to demonstrating that DG and CxG are quite compatible. See here: http://www.degruyter.com/view/j/cog.2012.23.issue-1/cog-2012-0006/cog-2012-0006.xml?format=INT. I do not understand your first point, and I concede the second point in a sense. If one wants to group words together without acknowledging heads, constituency is better for that, since dependency cannot by its very nature acknowledge headless constituents (exocentric structures). – Tim Osborne May 28 '14 at 21:30
  • However, I question what the advantage is in acknowledging headless constituents. It seems like capitulation, instead of coherent analysis. Categorial Grammar as I have encountered it used is entirely constituency-based. I know that some have claimed that Categorial Grammar is dependency grammar, but for the life of me, I do not see how they can make that claim. – Tim Osborne May 28 '14 at 21:33
  • @TimOsborne My first point was really more about the visual representation. If I label my nodes as NPs and VPs, I can more easily see the sentence structure without parsing it. So pedagogically, if I want to represent a sentence, I would start with a constituency parse. I was brought up on a dependency-based diagramming approach but in many ways it obscured important relationships.

    From a construction grammar perspective, I don't want to talk about heads or any dependencies at all. I'm just interested in the unification/meronymy parameters. Anything else is an external generalization.

    – Dominik Lukes May 28 '14 at 22:45
  • I agree about Categorial Grammar. Its trees are entirely constituent-based; however, the nodes encode the dependencies in quite an elegant way. But for the purposes of parsing, you're basically doing a constituent analysis. – Dominik Lukes May 28 '14 at 22:49
  • You state that you were raised on DG. By whom? In what DG framework? Just curious. – Tim Osborne May 28 '14 at 23:04
  • I'm still not understanding your first point. As soon as one acknowledges NPs and VPs, one is parsing. One can acknowledge non-descript constituents, but as soon as one assigns labels (NP, VP), one is acknowledging heads and dependents. Right? – Tim Osborne May 28 '14 at 23:10
  • I went to Czech primary schools where I got a diet of Tesniere-style dependency diagramming and then studied linguistics with Sgall and Hajicova in Prague and learned FGD. I can't claim any expertise in it, though. – Dominik Lukes May 28 '14 at 23:10
  • Agreed on the nature of labels. But the dependency is only in the metalanguage, not in the structure. Labels are easier to change than the edges of a tree. – Dominik Lukes May 28 '14 at 23:17
  • Yes, dependency is only in the metalanguage in constituency structures, and yes, labels are easier to change than edges in a tree. So I guess I sort of concede the point. From another perspective, however, producing a dependency parse requires much less effort than producing a constituency parse, due to the minimal number of nodes and edges in the dependency parse (or brackets, if one is using brackets). You have an interesting background; I am a bit envious, since my education in syntax was a struggle with syntacticians who had no knowledge of, or patience for, dependency. – Tim Osborne May 29 '14 at 00:29
  • Classical Reed-Kellogg sentence diagramming is primarily (though not completely or consistently) dependency-based. It makes a mash out of constituents, for the most part. That's what I was brought up on, but it was clear even in grade school that it wasn't practical for representing real language. Sorta like the Initial Teaching Alphabet for syntax, only not as successful. – jlawler Jun 02 '14 at 15:11
  • @Jlawler, we again agree completely. The Reed-Kellogg system is a mish-mash, which sows much confusion in the long run. It is mostly dependency-based, but it has two constituency-based divisions: the initial subject-predicate division and the verb-object division. Otherwise, it is almost entirely dependency-based. – Tim Osborne Jun 03 '14 at 19:00
  • Right. I would probly turn it upside down. I think dependency is terrifically important -- primary, in fact -- with regard to grammatical relations (1, 2, 3), which was constituent-based in R-K, but constituency is much more important outside that sphere. Naturally one needs to be able to address either, but there are clearly places where one is more relevant than the other. And while we're at it, we should probly mention semantics, pragmatics, sound symbolism, intonation, gaze direction, facial expressions, and integrated gestures, which are also important to be able to address. – jlawler Jun 03 '14 at 19:25
  • Categorial Grammar works better for cyclic phonology, since phonological rules can potentially refer both to the parts of a form and to the form with parts combined. Flapping of t is possible in "at Akron" because t is at the end of "at" and because it is followed by a vowel in the form resulting from concatenation of "at" and "Akron". – Greg Lee Feb 21 '15 at 13:56

I hope I correctly understand the question as being a general one, rather than particularly about automated parsing.

Here's what I was taught in Syntax and have believed ever since (though maybe I have missed some advances in the dependency framework).

  1. In general: Constituency, but not dependency, shows units on which syntax operates. I.e., constituency reflects the fact that syntactic processes target phrases, rather than words (or sub-trees of the dependency tree), no matter how large those phrases are and what structure they have. Admittedly, dependencies are easy for selectional restrictions and are applicable to a wider range of phenomena "as is", without further additions/stipulations.

  2. Dependency doesn't, and constituency does, capture recursion. Syntactically, (1) and (2) are units of the same class:

    (1) the dog, that worried the cat, that killed the rat...

    (2) the dog

For instance, they have the same distributional properties, which is immediately visible with labeled constituents (a toy grammar illustrating this recursion appears after this list).

  3. Constituency has a straightforward way to express that one dependent is more strongly, or more closely, connected to the head than another. E.g., in English this is reflected by word order in certain adjuncts:

    (3) builds houses in Brunswick

    (4) *builds in Brunswick houses

In Russian PPs the NP is "closer" to the head P than the adverb is, both semantically and as shown by word order:

(5) [prjamo [nad oknom]]
     right above window
'right above the window'
(6) *nad prjamo oknom
(7) [[nad oknom] prjamo]

And probably English does the same.
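Point 2 can be made concrete with a toy grammar in which a recursive NP rule licenses (1) and (2) as units of the same class. A minimal Python sketch, assuming NLTK is installed; the rules are purely illustrative, not a serious grammar of English:

    import nltk

    # NP is the start symbol; "NP -> NP RC" is the directly recursive rule.
    grammar = nltk.CFG.fromstring("""
    NP -> Det N | NP RC
    RC -> Rel VP
    VP -> V NP
    Det -> 'the'
    N -> 'dog' | 'cat' | 'rat'
    V -> 'worried' | 'killed'
    Rel -> 'that'
    """)

    parser = nltk.ChartParser(grammar)
    tokens = "the dog that worried the cat that killed the rat".split()
    for tree in parser.parse(tokens):
        tree.pretty_print()  # every parse is an NP, the same class as bare "the dog"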

To refresh this in my head I looked into Yakov Testelets's excellent "Introduction to Syntax" (in Russian). He mentions several attempts to get the best of both worlds, though I haven't read any of them: Hudson's "grammar of words", Borschev and Khomiakov's "club systems", and Gladkiy's "systems of syntactic groups".

Ivan Kapitonov
  • Thanks for your answer! I think your interpretation of the points you produce is indeed outdated. I would like to respond to each of your points with a few sentences, but doing so would take up more space than allowed here. I can, however, just give a sense of how the points your raise may no longer be accurate (and actually never were) concerning dependency-based structures. If you are interested in a more meaningful exchange, I encourage you to contact me via email: tjo3ya@yahoo.com. – Tim Osborne Feb 21 '15 at 12:30
  • Concerning your first point, dependency structures acknowledge phrases, too. A phrase is a complete subtree consisting of two or more words. Many mechanisms of syntax target complete subtrees in DGs, and these correspond to XPs in phrase structure grammars. Dependency captures recursion in a way similar to constituency: smaller complete subtrees are contained inside larger complete subtrees. This is true of your "dog...cat...rat" example. (A sketch of the complete-subtree notion appears after these comments.) – Tim Osborne Feb 21 '15 at 12:35
  • Sure! I guess a few notes here will be useful for the community, but I'll want to know more. – Ivan Kapitonov Feb 21 '15 at 12:36
  • Concerning your comments about word order, the examples you provide can be addressed in a way closely similar to how they are addressed in a constituency grammar. In particular, some dependents must precede their head, while others must follow it. In your example, as an adverb "prjamo" must be a predependent of "nad" and must therefore precede it, and as a noun "oknom" must be a postdependent of "nad" and so must follow it. – Tim Osborne Feb 21 '15 at 12:40
  • My final comment here concerns the historical development of DG. I think your view of DG is representative of how DG has traditionally been viewed. It abstracted away from actual word order (linear order) in order to focus on hierarchical order. This left the inaccurate impression that DG does not acknowledge the same units of syntax that constituency does, namely phrases. A phrase in DG is simply a complete subtree, and you can do everything with these complete subtrees that you can do with phrases in constituency grammars. – Tim Osborne Feb 21 '15 at 12:44
  • Well, but I intended (7) to show that "prjamo" can jump around. And then you'd need to have a component tracking linear precedence, which "constituent"-people try to derive from simpler things. – Ivan Kapitonov Feb 21 '15 at 12:45
  • Ah, yes, I see. The difference has to be explained in terms of the complement vs. adjunct distinction, and yes, constituency might appear to have an advantage insofar as it can group "nad oknom" as a constituent to the exclusion of "prjamo". Note, however, that both approaches acknowledge the distinction between complements (arguments) and adjuncts. The DG approach has to appeal to the fact that arguments (Tesniere's actants) tend to appear closer to their heads than adjuncts do. – Tim Osborne Feb 21 '15 at 13:02
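As an aside on the 'phrase = complete subtree' claim traded in the comments above: given a dependency tree, the complete subtree rooted in a word is easy to compute. A minimal Python sketch; the head-to-dependents table is illustrative (and linear order is ignored here), not any published formalism:

    # Dependency tree as head -> dependents (illustrative edges).
    tree = {
        "trying": ["We", "are", "understand"],
        "understand": ["to", "difference"],
        "difference": ["the"],
    }

    def complete_subtree(root, tree):
        """Words of the complete subtree rooted in `root` (DG's analogue of a phrase)."""
        words = [root]
        for dep in tree.get(root, []):
            words += complete_subtree(dep, tree)
        return words

    print(complete_subtree("understand", tree))  # ['understand', 'to', 'difference', 'the']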

My answer is partially motivated by Dominik Lukes's answer and some of the things I read in the comments that follow it.

  1. Chunking is a kind of constituency parsing. The theoretical motivation for chunking comes from constituency parsing, à la NP, VP, etc. Though chunking is considered to be only "shallow" parsing, i.e. not the proper kind, it is still useful for practical applications (a minimal chunking sketch follows this list).

    For example, I'm working on a project right now that needs the software to deal with tweets. Depending on your mindset, such text can be considered either "awful" or a dialect that parsers haven't been trained on. Either way, "full" parsers of either kind, constituency or dependency, fail pretty badly. Chunking gives me useful nuggets of information.

    Illinois LBJ Chunker

  2. Leaving aside the fact that it's my own fault that I know nothing about Dependency-Based Compositional Semantics1, constituency-based parse analyses map fairly easily to compositional semantics. I believe this claim applies equally well to deeper grammars (LTAG, HPSG, CCG, etc.) that are the intellectual children of the simple Phrase Structure Grammar. After all, the parse trees don't change much between one of these formalisms and another.
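Here is the chunking sketch referenced in point 1: a minimal NP chunker over POS-tagged text, assuming NLTK and its default tokenizer and tagger models are installed. The tag pattern is illustrative, not the Illinois chunker's actual grammar:

    import nltk

    # Shallow parse: mark NP chunks without building a full tree.
    text = "The opening film screens at the festival today."
    tagged = nltk.pos_tag(nltk.word_tokenize(text))

    # Illustrative NP pattern: optional determiner, adjectives, then nouns.
    chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
    print(chunker.parse(tagged))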

P.S.: I'm not convinced that automated dependency parsing is significantly faster than automated constituency parsing. The speed of parsing depends on many factors such as the algorithms used, the programming language, the quality of implementation, etc. All I can acknowledge right now is that some parsers are fast and some are slow.
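On the P.S.: a fair speed comparison would hold the test set and machine constant and vary only the parser. A minimal harness sketch; `parse` is any callable, and which parsers to plug in is deliberately left open:

    import time

    def seconds_per_sentence(parse, sentences, repeats=3):
        """Best-of-N wall-clock seconds per sentence for a `parse` callable."""
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            for sentence in sentences:
                parse(sentence)
            times.append(time.perf_counter() - start)
        return min(times) / len(sentences)

    # Hypothetical usage: same sentences, same machine, warmed-up parsers.
    # dep_time = seconds_per_sentence(my_dependency_parser, test_sentences)
    # con_time = seconds_per_sentence(my_constituency_parser, test_sentences)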


1 — The first time I came across DCS was just a few minutes ago when I googled up some phrases while answering this question.

prash
  • Thanks for your answer. You have brought a couple of things to my attention that I was not aware of, or that I had forgotten about. Abney's paper is of particular interest to me, since it bears directly on a current project of mine. I too was not aware of DCS; I'm going to have to look at that more closely. Concerning the speed of dependency parsing, I am a theoretical linguist, not a computational one. I am simply taking the DG computational guys at their word, e.g. Joakim Nivre, Covington, Chris Manning, etc.; they state that DG is faster and as accurate or more so. – Tim Osborne Jun 03 '14 at 20:24
  • Concerning chunking as illustrated with the first example in Abney's paper, it is not consistent with your claim in the first point of your answer. Abney's sentence is as follows: "[I begin] [with an intuition]: [when I read] [a sentence] [I read it] [a chunk] [at a time]". Some of the chunks in this example are clearly not constituents. "I begin" is not a constituent; "when I read" is not a constituent; "I read it" is not a constituent. In other words, chunks are often NOT constituents. Your first point therefore fails. Chunking provides no evidence for constituency over dependency. – Tim Osborne Jun 03 '14 at 20:32
  • However, the chunks in Abney's example are all "components", a type of unit that we are now acknowledging in our DG. The component is both a string and a catena. See the definition of the catena and the component in the following Wikipedia article: https://en.wikipedia.org/wiki/Catena_%28linguistics%29. Thus dependency can define the chunks in a very concrete way, whereas constituency cannot do the same. Conclusion: chunking actually supports dependency over constituency. – Tim Osborne Jun 03 '14 at 20:41
  • @TimOsborne: I would ignore the examples given in Abney's paper. Most chunkers I worked with just help identifying NPs. This is how one of them looks: '"[Linar]" by [Nastia Tarasova]: Do not miss [the opening film] of [the #tdf16] [today] at [20.30] at [Frida Liappa theater].' This was GATE's chunker. NLTK's chunker (one of the default models) also does something like this. – prash Jun 03 '14 at 20:46
  • Concerning compositionality, my coauthors and I have argued extensively that the catena is the most basic meaning-bearing unit of syntax and morphology. The catena is associated more with dependency than with constituency. The catena allows one to give idioms (e.g. pull X's leg, throw X to the wolves, dance on X's grave), which are meaning bearing units, a concrete expression in the syntax. Constituent-based syntax cannot do the same. See again the article on the catena in Wikipedia: https://en.wikipedia.org/wiki/Catena_%28linguistics%29. Thanks again for your valuable answer. – Tim Osborne Jun 03 '14 at 20:51
  • The examples you have now added are complete subtrees (= constituents) in both dependency- and constituency-based syntax. In other words, both parses, dependency- and constituency-based ones, acknowledge NPs as constituents (= complete subtrees). The difficulty here probably has to do with terminology, since one does not traditionally associate constituents with dependency-based syntax. But see the example trees in the question; both show the NP "the difference" as a constituent (= complete subtree). – Tim Osborne Jun 03 '14 at 21:01
  • @TimOsborne, I skimmed the wikipedia entry, do you mind adding some examples of complex NPs there? I did not see edge labels either. How would you deal with sentences like "The Free Software Foundation supported Electronic Frontier Foundation in its fight against snooping."? How do you differentiate between subject and object? And if you don't have labels, how would you say something is an NP? – prash Jun 03 '14 at 21:18
  • Yes, to distinguish between subject and object, dependency traditionally labels the dependency edges with SUBJ, OBJ, etc. I did that when I wrote this Wikipedia article, for instance (see the bottom of the page): https://en.wikipedia.org/wiki/Syntactic_function. But I usually leave the labels off, since they are not necessary for the point being made. Creating the trees for Wikipedia articles is laborious. Perhaps I can do that another time. I will be happy to produce trees for you, however, using Word. I can send you a docx, if you provide an email address. My email is tjo3ya@yahoo.com. – Tim Osborne Jun 03 '14 at 22:55
  • I'm not sure that DCS helps your argument, prash. On just 10 pages, the authors present a "new" way of representing compositional meaning. Disregarding that I find this "way" not really illuminating, I must put forward that it's not compositional meaning that's difficult, but rather non-compositional meaning. The catena is the best tool to assign non-compositional meaning to surface forms, starting with William O'Grady. – Thomas Gross Jun 04 '14 at 07:47

I don't think that constituency is necessary, although I acknowledge the notion of "constituent" (I just don't think it's the central notion on which language structure is built). Of course, I have to admit that many linguists seem to view the constituent as an indispensable tool.

My impression is that rather than questioning the necessity of constituency, the focus should be on its suitability for producing coherent and accurate accounts of linguistic phenomena. (Since it's the prerogative of the questioner to formulate their question, my contribution here is not really an answer.)

My personal doubt about constituency stems from a solid conviction that the notion incurs bracketing paradoxes across the whole grammatical spectrum. Be it displacement, ellipsis, periphrasis, or morphosyntactic phenomena such as multiple auxiliaries, one always needs additional assumptions (movement, merge, etc.) in order to account for the fact that stuff that should go together doesn't do so.

Constituents simply don't seem to be the conceptual units along which language is structured.

Thomas Gross
  • Not language, no. But syntax, yes. (BTW, by syntax I mean specifically the part of grammar that deals with grammar outside the word; i.e., I'm not including morphology or phonology, though they do influence syntax, like semantics and pragmatics.) – jlawler Jun 02 '14 at 14:56
  • @Jlawler, I've been thinking about your comments above. Why don't you produce an answer according to your presuppositions about the nature of dependency and constituency? I could then see where our presuppositions diverge. You are obviously extremely knowledgeable. I would profit from such an answer. – Tim Osborne Jun 02 '14 at 15:13

Linguistics is the science of language (as Max Müller proposed). What is important in science is the truth, not (as the question implies) how economical illustrative diagrams can be made. Reducing the number of nodes in structure trees is utterly irrelevant.

Greg Lee