
I have been set a natural language parsing task in Prolog. So far the program works to an extent: it parses a sentence given as a list of words and prints the result. For example, for the input [the, cat, sat, on, the, mat] it correctly outputs:

(noun_phrase(det(the), np2(noun(cat))), verb_phrase(verb(sat), pp(prep(on), noun_phrase(det(the), np2(noun(mat)))))) 

The next task is to extract the keywords from the sentence, i.e. the noun in the noun phrase, the verb in the verb phrase, and the noun in the verb phrase, so that I can return a list such as [cat, sat, mat]. Could anybody help me get started? I'm very stuck with this. Thanks!

My current code is:

sentence(S, sentence(NP, VP)) :-
    nl,
    np(S, NP, R),
    vp(R, VP, []),
    write('sentence('), nl,
    write('   '), write(NP), nl,
    write('    '), write(VP), nl,
    write('  ').

np([X | S], noun_phrase(det(X), NP2), R) :-
    det(X),
    np2(S, NP2, R).
np(S, NP, R) :-
    np2(S, NP, R).
np(S, np(NP, PP), R) :-
    append(X, Y, S), /* Changed here - otherwise possible endless recursion */
    pp(Y, PP, R),
    np(X, NP, []).

np2([X | R], np2(noun(X)), R) :-
    noun(X).
np2([X | S], np2(adj(X), NP), R) :-
    adj(X),
    np2(S, NP, R).

pp([X | S], pp(prep(X), NP), R) :-
    prep(X),
    np(S, NP, R).

vp([X | R], verb_phrase(verb(X)), R) :- /* Changed here - added the third argument */
    verb(X).
vp([X | S], verb_phrase(verb(X), PP), R) :-
    verb(X),
    pp(S, PP, R).
vp([X | S], verb_phrase(verb(X), NP), R) :-
    verb(X),
    np(S, NP, R).

det(the).
det(with).
noun(cat).
noun(mat).
verb(sat).
prep(on).
adj(big).

1 Answer


First, note that your parse trees use terms with the same name but different arity (np2/1 and np2/2, and likewise verb_phrase/1 and verb_phrase/2). That might not be what you want, as it can lead to confusion.
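As a small illustration of the arity point: Prolog treats two terms as having the same functor only when both the name and the arity match. The helper same_functor/2 below is a hypothetical name introduced for this sketch and is not part of the question's code.

```prolog
% same_functor(+T1, +T2): succeeds only when both terms share
% the same name AND the same number of arguments.
% (Hypothetical helper, for illustration only.)
same_functor(T1, T2) :-
    functor(T1, Name, Arity),
    functor(T2, Name, Arity).

% A one-argument np2 term and a two-argument np2 term are different
% functors, so this query fails:
% ?- same_functor(np2(noun(cat)), np2(adj(big), np2(noun(cat)))).
```

Any predicate that walks the parse tree therefore needs a separate clause for each arity.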

To get the list you want, you can do a second pass, this time over the parse tree that sentence/2 returns, with something like this:

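% extract_keywords(+Words, -Keywords): parse Words, then pick out the
% subject noun, the verb, and the noun inside the prepositional phrase.
extract_keywords(L, [N, V, VN]) :-
    sentence(L, sentence(NP, verb_phrase(verb(V), pp(_, VP)))),
    noun(NP, N),
    noun(VP, VN).

% noun(+Tree, -Noun): descend through a noun phrase until noun/1 is reached.
noun(noun_phrase(_, NP), N) :-
    noun(NP, N).
noun(noun(N), N).
noun(np2(_, NP), N) :-   % np2/2: skip the adjective
    noun(NP, N).
noun(np2(NP), N) :-      % np2/1: wraps the noun itself
    noun(NP, N).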

The extract_keywords predicate parses the sentence and breaks it apart into the noun phrase and the verb phrase, then walks each sub-phrase to get the nouns.

This example is somewhat weak, in that it only works on this kind of sentence: the verb phrase must consist of a verb followed by a prepositional phrase.

Here are two examples:

?- extract_keywords([the, cat, sat, on, the, mat], X).
X = [cat, sat, mat] 

?- extract_keywords([big, cat, sat, on, the, big, mat], X).
X = [cat, sat, mat]
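To get around the limitation mentioned above, one option is to walk an arbitrary parse tree and collect every word wrapped in noun/1 or verb/1, whatever the shape of the verb phrase. The following is only a minimal sketch under some assumptions: keywords/2 and keywords_list/2 are hypothetical names, the tree is assumed to be fully instantiated, and the words are returned in left-to-right order.

```prolog
:- use_module(library(lists)).   % for append/3

% keywords(+Tree, -Words): collect the words found under noun/1 and
% verb/1 nodes of a ground parse tree, left to right.
keywords(noun(W), [W]) :- !.
keywords(verb(W), [W]) :- !.
keywords(Tree, Words) :-
    compound(Tree), !,
    Tree =.. [_ | Args],          % split the node into its children
    keywords_list(Args, Words).
keywords(_, []).                  % atoms such as 'the' contribute nothing

% keywords_list(+Trees, -Words): concatenate the keywords of each subtree.
keywords_list([], []).
keywords_list([T | Ts], Words) :-
    keywords(T, W1),
    keywords_list(Ts, W2),
    append(W1, W2, Words).
```

It can be called on the tree produced by the parser, for example:

```prolog
?- keywords(sentence(noun_phrase(det(the), np2(noun(cat))),
                     verb_phrase(verb(sat),
                                 pp(prep(on), noun_phrase(det(the), np2(noun(mat)))))),
            X).
X = [cat, sat, mat].
```

Unlike extract_keywords, this returns as many keywords as the tree contains, so a sentence with extra noun phrases would give a longer list.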