It is easy to implement a top-down depth-first recognizer in Prolog --- for this is the strategy Prolog itself uses in its search. Actually, it's not hard to implement a top-down breadth-first recognizer in Prolog either, though I'm not going to discuss how to do that.
The implementation will be far more efficient than the one used in the naive bottom-up recognizer that we discussed last week. This is not because top-down algorithms are better than bottom-up ones, but simply because we are not going to use append/3. Instead we'll use difference lists.
Here's the main predicate, recognize_topdown/3. Note the operator declaration (we want to use the ---> notation we introduced last week). Its second clause looks up grammar rules with the goal Category ---> RHS.
Here Category is the category we want to recognize (s, np, vp, and so on). The second and third arguments are a difference list representation of the string we are working with (read this as: the second argument starts with a string of category Category, leaving Reststring, the third argument, behind).
The first clause deals with the case that Category is a preterminal that matches the category of the next word on the input string. That is: we've got a match and can remove that word from the string that is to be recognized.
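The difference list reading of the first clause can be tried out in isolation. The following is a minimal sketch, not the recognizer itself: the predicate name preterminal/3 and the two-word lexicon are made up for this illustration, and the lex(Word, Category) format of the lexicon is an assumption.

```prolog
% Hypothetical two-word lexicon; the lex(Word, Category) format is assumed.
lex(the, det).
lex(robber, n).

% preterminal(+Category, +String, -RestString)
% String starts with a word of category Category; RestString is what is
% left of String once that word has been removed.
preterminal(Category, [Word|RestString], RestString) :-
    lex(Word, Category).
```

The query preterminal(det, [the, robber], Rest) succeeds with Rest = [robber], while preterminal(n, [the, robber], Rest) fails, because the first word is not a noun. No call to append/3 is needed: removing the word is just a matter of unifying the string with [Word|RestString].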
The second clause deals with phrase structure rules. Note that we are using the CFG rules left-to-right: Category will be instantiated with something, so we look for rules with Category as a left-hand-side, and then we try to match the right-hand-side of these rules (that is, RHS) with the string.
Now for matches/3, the predicate which does all the work. The first clause handles an empty list of symbols to be recognized: the string is returned unchanged. The second clause lets us match a non-empty list against the difference list. This works as follows. We want to see if String begins with strings belonging to the categories [Category|Categories], leaving behind RestString. So we see if String starts with a substring of category Category (the first item on the list). Then we recursively call matches to see whether what's left over (String1) starts with substrings belonging to the categories Categories, leaving behind RestString. This is classic difference list code.
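The threading that matches/3 performs for an arbitrary right-hand side can be seen by unrolling it by hand for a single rule. This is a minimal sketch, assuming a hypothetical rule np ---> [det, n] and two hard-wired words; np/2, det/2 and n/2 are names made up for this illustration and are not part of the recognizer.

```prolog
% The hypothetical rule np ---> [det, n], unrolled by hand.
% Each goal consumes a prefix of its first argument and hands the
% remainder (String1) on to the next goal.
np(String, RestString) :-
    det(String, String1),
    n(String1, RestString).

% Hard-wired preterminals, written directly as difference lists.
det([the|Rest], Rest).
n([robber|Rest], Rest).
```

The query np([the, robber, fell], Rest) succeeds with Rest = [fell]. matches/3 does the same job generically: instead of one clause per rule with a fixed number of intermediate strings, it walks down the list of categories and introduces one intermediate string per step.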
Finally, we can wrap this up in a driver predicate:
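Putting the driver together with the clauses described in this section gives one loadable file. What follows is a sketch pieced together from the prose, not a definitive listing: the driver name recognize_topdown/1, the operator priority 700, and the lex(Word, Category) lexicon format are assumptions, and the small grammar is a hypothetical stand-in for ourEng.pl, chosen only so that the file runs on its own.

```prolog
:- op(700, xfx, --->).

% Toy phrase structure rules (hypothetical stand-in for ourEng.pl).
s    ---> [np, vp].
np   ---> [pn].
np   ---> [det, nbar].
nbar ---> [n].
nbar ---> [n, rel].
rel  ---> [wh, vp].
vp   ---> [iv].
vp   ---> [tv, np].
vp   ---> [sv, s].

% Toy lexicon; the lex(Word, Category) format is assumed.
lex(jules, pn).      lex(marsellus, pn).
lex(the, det).       lex(robber, n).      lex(who, wh).
lex(fell, iv).       lex(shot, tv).       lex(believed, sv).

% Clause 1: Category is a preterminal matching the next word.
recognize_topdown(Category, [Word|Reststring], Reststring) :-
    lex(Word, Category).
% Clause 2: pick a rule for Category and match its right-hand side.
recognize_topdown(Category, String, Reststring) :-
    Category ---> RHS,
    matches(RHS, String, Reststring).

% matches(+Categories, +String, -RestString)
matches([], String, String).
matches([Category|Categories], String, RestString) :-
    recognize_topdown(Category, String, String1),
    matches(Categories, String1, RestString).

% Driver: the whole string must be an s, with nothing left over.
recognize_topdown(String) :-
    recognize_topdown(s, String, []).
```

With this file loaded, recognize_topdown([jules, believed, the, robber, who, shot, the, robber, fell]) succeeds, and recognize_topdown([jules, believed, the, robber, who, shot, marsellus, felll]) fails because felll is not in the lexicon. Note that the toy grammar contains no left-recursive rules; a depth-first top-down recognizer of this kind would loop on a rule such as np ---> [np, rel].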
Now we're ready to play. We shall make use of the ourEng.pl grammar that we worked with last week.
We used this same grammar with our bottom-up recognizer, bottomup_recognizer.pl, and we saw that it was very easy to grind bottomup_recognizer.pl into the dust. For example, the following are all sentences admitted by the grammar:
jules believed the robber who shot the robber fell
jules believed the robber who shot the robber who shot marsellus fell
The bottom-up recognizer takes a long time on these examples. But the top-down program handles them without problems.
The following sentence is not admitted by the grammar, because the last word is spelled wrong (felll instead of fell).
jules believed the robber who shot marsellus felll
Unfortunately it takes bottomup_recognizer.pl a long time to find that out, and hence to reject the sentence. The top-down program is far better: it rejects the sentence without any noticeable delay.