Representing knowledge extracted from output of dependency parsing

Question

I am working on representing knowledge extracted from a paragraph and ranking it to produce abstractive summaries. I have implemented dependency parsing using Stanford NLP, which outputs each parse as a graph in DOT format.

The dependency parses of the following two sentences are shown below.

Sentence 1 - John is a computer scientist

Dot format -

digraph G{
edge [dir=forward]
node [shape=plaintext]

0 [label="0 (None)"]
0 -> 5 [label="root"]
1 [label="1 (John)"]
2 [label="2 (is)"]
3 [label="3 (a)"]
4 [label="4 (computer)"]
5 [label="5 (scientist)"]
5 -> 2 [label="cop"]
5 -> 4 [label="compound"]
5 -> 3 [label="det"]
5 -> 1 [label="nsubj"]
}

Sentence 2 - John has an elder sister named Mary.

Dot Format -

digraph G{
edge [dir=forward]
node [shape=plaintext]

0 [label="0 (None)"]
0 -> 2 [label="root"]
1 [label="1 (John)"]
2 [label="2 (has)"]
2 -> 5 [label="dobj"]
2 -> 1 [label="nsubj"]
3 [label="3 (an)"]
4 [label="4 (elder)"]
5 [label="5 (sister)"]
5 -> 6 [label="acl"]
5 -> 3 [label="det"]
5 -> 4 [label="amod"]
6 [label="6 (named)"]
6 -> 7 [label="dobj"]
7 [label="7 (Mary)"]
}

Now I want to merge these two graphs at a common node, John. I am currently using graphviz to import the DOT graph like this:

from graphviz import Source
# dotGraph is the DOT string produced by the parser; filepath is where the rendering is written
s = Source(dotGraph, filename=filepath, format="png")

But there seems to be no functionality for merging graphs in Graphviz or NetworkX. How can this be done?

Commissioner Gordon, turn on the Merge Signal. This is a case for the Biosyntax Squad. — jlawler, Feb 04 '17 at 15:51
What would your expected output look like? How exactly does that help you achieve your stated purpose? — Lefty G Balogh, Feb 05 '17 at 13:38
The goal is to have all the information related to a particular entity, in the same graph. Further, it can be ranked and used to generate summaries. — Riken Shah, Feb 06 '17 at 06:21
I am a theoretical DG guy. I cannot comment on the computational side of what you are doing. I would, though, like to point out that the Stanford annotation scheme is controversial. For instance, your first dependency tree shows scientist as the root of the sentence. From a linguistic point of view, a stronger case can be made for viewing the finite verb is as the root. — Tim Osborne, Apr 12 '20 at 15:14

score 1 · Answer 1 · answered Jun 21 '18 at 23:09

Since you are using CoreNLP to generate dependency trees, a good way to tackle your problem would be to use Tsurgeon, a library for manipulating parse trees.

Tsurgeon is a (parse) tree transformation language. (Also check Tregex and Semgrex on the same link.)
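
If you would rather stay in Python than go through Tsurgeon, here is a minimal sketch of one way to do the merge by hand. It assumes the DOT output always looks like the two examples in the question (one node or edge per line, labels of the form `N (word)`), and that you already know which words should be treated as the same entity. The names `parse_dot`, `merge_sentences`, `to_dot` and `shared_words`, and the `s0:5:scientist` key scheme, are my own and not part of Graphviz or CoreNLP.

```python
import re

# Patterns for the two line shapes in the question's DOT output (an assumption).
NODE_RE = re.compile(r'^\s*(\d+)\s*\[label="\d+ \((.*)\)"\]\s*$')
EDGE_RE = re.compile(r'^\s*(\d+)\s*->\s*(\d+)\s*\[label="([^"]*)"\]\s*$')

SENTENCE_1 = """digraph G{
0 [label="0 (None)"]
0 -> 5 [label="root"]
1 [label="1 (John)"]
2 [label="2 (is)"]
3 [label="3 (a)"]
4 [label="4 (computer)"]
5 [label="5 (scientist)"]
5 -> 2 [label="cop"]
5 -> 4 [label="compound"]
5 -> 3 [label="det"]
5 -> 1 [label="nsubj"]
}"""

SENTENCE_2 = """digraph G{
0 [label="0 (None)"]
0 -> 2 [label="root"]
1 [label="1 (John)"]
2 [label="2 (has)"]
2 -> 5 [label="dobj"]
2 -> 1 [label="nsubj"]
3 [label="3 (an)"]
4 [label="4 (elder)"]
5 [label="5 (sister)"]
5 -> 6 [label="acl"]
5 -> 3 [label="det"]
5 -> 4 [label="amod"]
6 [label="6 (named)"]
6 -> 7 [label="dobj"]
7 [label="7 (Mary)"]
}"""


def parse_dot(dot_text):
    """Return ({token_id: word}, [(head_id, dependent_id, relation)])."""
    words, edges = {}, []
    for line in dot_text.splitlines():
        node = NODE_RE.match(line)
        edge = EDGE_RE.match(line)
        if node:
            words[int(node.group(1))] = node.group(2)
        elif edge:
            edges.append((int(edge.group(1)), int(edge.group(2)), edge.group(3)))
    return words, edges


def merge_sentences(dot_texts, shared_words):
    """Union the edges of several parses, fusing only tokens in shared_words."""
    merged = set()
    for index, dot_text in enumerate(dot_texts):
        words, edges = parse_dot(dot_text)

        def key(token_id):
            # Shared entities keep their bare word so they collapse into one node;
            # every other token stays unique to its sentence and position.
            word = words[token_id]
            return word if word in shared_words else f"s{index}:{token_id}:{word}"

        for head, dependent, relation in edges:
            if relation == "root":
                continue  # drop the artificial root node of each sentence
            merged.add((key(head), relation, key(dependent)))
    return merged


def to_dot(triples):
    """Serialize the merged edges back to DOT (words containing quotes are not escaped)."""
    lines = ["digraph G{", "edge [dir=forward]", "node [shape=plaintext]"]
    for head, relation, dependent in sorted(triples):
        lines.append(f'"{head}" -> "{dependent}" [label="{relation}"]')
    lines.append("}")
    return "\n".join(lines)


merged = merge_sentences([SENTENCE_1, SENTENCE_2], {"John"})
merged_dot = to_dot(merged)
```

The string in `merged_dot` can be passed to `Source(...)` exactly as in the question. Fusing only the words listed in `shared_words` is a deliberate choice: merging on every identical surface form would also glue together unrelated function words across sentences. If you prefer NetworkX, `networkx.compose` takes the union of two graphs by node name, so relabeling the shared token to the same name in both graphs first should have the same effect.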
