<< Prev | - Up - | Next >> |
It is easy to implement a top-down depth-first recognizer in Prolog --- for this is the strategy Prolog itself uses in its search. Actually, it's not hard to implement a top-down breadth-first recognizer in Prolog either, though I'm not going to discuss how to do that.
The implementation will be far better than that used in the naive bottom up recognizer that we discussed last week. This is not because because top-down algorithms are better than bottom-up ones, but simply because we are not going to use append/3
. Instead we'll use difference lists.
Here's the main predicate, recognize_topdown/3
. Note the operator declaration (we want to use our -
-
->
notation we introduced last week).
?- op(700,xfx,--->).
recognize_topdown(Category,[Word|Reststring],Reststring) :-
lex(Word,Category).
recognize_topdown(Category,String,Reststring) :-
Category ---> RHS,
matches(RHS,String,Reststring).
Here Category
is the category we want to recognize (s
, np
, vp
, and so on). The second and third argument are a difference list representation of the string we are working with (read this as: the second argument starts with a string of category Category
leaving Reststring
, the third argument behind).
The first clause deals with the case that Category
is a preterminal that matches the category of the next word on the input string. That is: we've got a match and can remove that word from the string that is to be recognized.
The second clause deals with phrase structure rules. Note that we are using the CFG rules right-to-left: Category
will be instantiated with something, so we look for rules with Category
as a left-hand-side, and then we try to match the right-hand-side of these rules (that is, RHS
) with the string.
Now for matches/3
, the predicate which does all the work:
matches([],String,String).
matches([Category|Categories],String,RestString) :-
recognize_topdown(Category,String,String1),
matches(Categories,String1,RestString).
The first clause handles an empty list of symbols to be recognized. The string is returned unchanged. The second clause lets us match a non-empty list against the difference list. This works as follows. We want to see if String
begins with strings belonging to the categories
[Category|Categories]
leaving behind RestString
. So we see if String
starts with a substring of category Category
(the first item on the list). Then we recursively call matches
to see whether what's left over (String1
) starts with substrings belonging to the categories Categories
leaving behind RestString
. This is classic difference list code.
Finally, we can wrap this up in a driver predicate:
recognize_topdown(String) :-
recognize_topdown(s,String,[]).
Now we're ready to play. We shall make use of the ourEng.pl
grammar that we worked with last week.
We used this same grammar with our bottom-up recognizer bottomup_recognizer.pl
--- and we saw that it was very easy to grind bottomup_recognizer.pl
into the dust. For example, the following are all sentences admitted by the ourEng.pl
grammar:
jules believed the robber who shot the robber fell
jules believed the robber who shot the robber who shot marsellus fell
The bottom-up recognizer takes a long time on these examples. But the top-down program handles them without problems.
The following sentence is not admitted by the grammar, because the last word is spelled wrong (felll instead of fell).
jules believed the robber who shot marsellus felll
Unfortunately it takes bottomup_recognizer.pl
a long time to find that out, and hence to reject the sentence. The top-down program is far better.
<< Prev | - Up - | Next >> |