7.4 How Well Have We Done?

How good is this parser? Well, it's pretty easy to understand --- but performance-wise it's awful. Our English grammar is pretty basic --- but the query

?- recognize_bottomup([mia,knew,vincent,shot,marsellus]).

will make SICStus Prolog hesitate, and

?- recognize_bottomup([jules,believed,the,robber,who,shot,marsellus,fell]).

is painfully slow.

In fact, this parser is so bad, it's useful --- it acts as a grim warning of how badly things can be done. We'll see in later lectures that we can do a lot better. Moreover, it's useful to try and understand why it is so awful. For it is bad in two quite distinct ways: at the level of implementation and at the level of algorithm.

7.4.1 Implementation

By this stage of your Prolog career, it should be quite clear to you why this recognizer is badly implemented. We made heavy use of append/3 to subdivide lists into the required pattern. This is very inefficient. If you examine a trace, you will see that the program spends most of its time trying out different ways of putting lists together. The time it devotes to the key recognition steps is comparatively small.
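To get a feel for how much list juggling is involved, here is a minimal sketch. It assumes a helper along the lines of split/4 below, built from two calls to append/3. The names split/4 and count_splits/2 are ours, chosen for illustration, and need not match the recognizer's own code.

```prolog
:- use_module(library(lists)).   % append/3 (autoloaded on some systems)

% split(+ABC, ?A, ?B, ?C): ABC is A followed by B followed by C.
% Hypothetical helper: it stands in for whatever the recognizer uses
% to cut its input into three pieces.
split(ABC, A, B, C) :-
    append(A, BC, ABC),
    append(B, C, BC).

% count_splits(+List, -N): N is the number of ways split/4 can cut List.
count_splits(List, N) :-
    findall(x, split(List, _, _, _), Ways),
    length(Ways, N).
```

The query

?- count_splits([mia,knew,vincent,shot,marsellus], N).

gives N = 21. In general a list of length n can be cut in (n+1)(n+2)/2 ways, and a recognizer built on such a helper enumerates these cuts afresh at every step, although only a handful of them correspond to anything in the grammar.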

It's worth emphasizing that this implementation inefficiency has nothing to do with the basic idea of bottom up recognition/parsing. For a start, there are many nice Prolog implementations of fairly simple bottom up parsers --- in particular, what are known as shift-reduce parsers --- that are much better than this. Moreover, next week we will be discussing naive top down parsing/recognition. If this were implemented using append/3, the result would be just as inefficient as what we have seen today. But next week we are going to take care to avoid this implementation inefficiency. We'll be using difference lists instead, and as we'll see, the result is a lot faster.
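To make the point about shift-reduce recognizers concrete, here is a rough sketch of one. It is only an illustration under our own assumptions: the toy lexicon and grammar, and the predicate names word/2, reduce/2, sr/2 and recognize_sr/1, are invented for this example. The recognizer is still bottom up, but it never guesses how to cut the input: it keeps a stack of categories and matches the top of that stack directly against the grammar rules.

```prolog
% Toy lexicon and grammar, assumed for illustration only.
word(mia, pn).    word(vincent, pn).    word(marsellus, pn).
word(knew, tv).   word(knew, sv).       word(shot, tv).

% reduce(+Stack, -NewStack): the stack holds the most recent category
% first, so each right-hand side appears reversed.
reduce([pn|S],    [np|S]).     % np -> pn
reduce([np,tv|S], [vp|S]).     % vp -> tv np
reduce([s,sv|S],  [vp|S]).     % vp -> sv s
reduce([vp,np|S], [s|S]).      % s  -> np vp

% recognize_sr(+Words): succeeds if Words can be reduced to a single s.
recognize_sr(Words) :-
    sr([], Words).

% sr(+Stack, +Words): done when only an s is left and no words remain.
sr([s], []).
sr(Stack, Words) :-                 % reduce: rewrite the top of the stack
    reduce(Stack, NewStack),
    sr(NewStack, Words).
sr(Stack, [Word|Words]) :-          % shift: push the next word's category
    word(Word, Cat),
    sr([Cat|Stack], Words).
```

With these definitions the query

?- recognize_sr([mia,knew,vincent,shot,marsellus]).

succeeds. Note that this sketch still backtracks, so it shares the algorithmic problem discussed in the next section. What it avoids is the blind list splitting.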

7.4.2 Algorithmic Problems

There is, however, a deeper problem. What we have discussed today is a naive bottom up recognizer --- but its naivete has nothing to do with its implementation. As I just said, next week we're going to be studying naive top down parsing, and we're going to take better care of our implementation, and the results will be much better. Nonetheless, the work we do next week will, in a very important sense, be naive too.

In what deeper sense is our bottom up recognizer inefficient? It has an inefficiency you will find in many different kinds of parsers/recognizers (top down, left corner, ...), namely: the algorithm needlessly repeats work.

Consider the sentence ``The robber knew Vincent shot Marsellus''. As we have already mentioned, this sentence is locally ambiguous. In particular, the first part of the string, ``The robber knew Vincent'', is a sentence. Now, our naive bottom up algorithm will find that this string is a sentence --- and then discover that there are more words in the input. Hence, in this case, s is not the right analysis of that initial string. Thus our parser will backtrack, and it will undo and forget all the useful work it has done. For example, while showing that ``The robber knew Vincent'' is a sentence, the parser has to show that ``The robber'' is an np. Great! This is dead right! And we need to know this to get the correct analysis! But then, when it backtracks, it undoes all this work. So when it finally hits on the right way of analyzing ``The robber knew Vincent shot Marsellus'', it will once again demonstrate that ``The robber'' is an np. If lots of backtracking is going on, it may end up showing this simple fact over and over and over again.
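The repeated work can be made visible with a small experiment. The sketch below is our own reconstruction of a naive bottom up recognizer, with an assumed toy grammar and a counter added. It is not necessarily the code given earlier in the lecture, and all predicate names in it are hypothetical. The counter np_proofs/1 is bumped every time det n is reduced to np. We use a variant of the example sentence with ``the robber'' in the middle, because this sketch tries leftmost reductions first and so mostly redoes material further to the right.

```prolog
:- use_module(library(lists)).   % append/3
:- dynamic np_proofs/1.

% Toy grammar and lexicon, assumed for illustration only.
rule(s,  [np, vp]).
rule(np, [det, n]).
rule(np, [pn]).
rule(vp, [tv, np]).
rule(vp, [sv, s]).

lex(the, det).     lex(robber, n).
lex(vincent, pn).  lex(marsellus, pn).
lex(knew, tv).     lex(knew, sv).     lex(shot, tv).

split(ABC, A, B, C) :- append(A, BC, ABC), append(B, C, BC).

% recognize(+String): replace some substring by a category, repeat
% until only s is left. Backtracks over every choice of substring.
recognize([s]).
recognize(String) :-
    split(String, Front, Middle, Back),
    reduce(Middle, Cat),
    note(Middle, Cat),
    append(Front, [Cat|Back], NewString),
    recognize(NewString).

reduce([Word], Cat) :- lex(Word, Cat).
reduce(RHS, Cat)    :- rule(Cat, RHS).

% note(+Middle, +Cat): bump the counter whenever det n is reduced to np.
% The assert/retract side effect survives backtracking.
note([det,n], np) :- !,
    retract(np_proofs(N)), !,
    N1 is N + 1,
    assertz(np_proofs(N1)).
note(_, _).

% count_np_proofs(+Words, -N): N is how many times an np was built from
% det n before the first successful analysis of Words was found.
count_np_proofs(Words, N) :-
    retractall(np_proofs(_)),
    assertz(np_proofs(0)),
    once(recognize(Words)),
    np_proofs(N).
```

The query

?- count_np_proofs([vincent,knew,the,robber,shot,marsellus], N).

reports a value of N greater than one, although the sentence contains only one phrase of the form det n: the same np is built, thrown away on backtracking, and built again.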

It is for this reason that I call the algorithm described above `naive'. The poor implementation is easily corrected --- next week's top down parser won't have it --- but this deeper problem is harder to solve.

But the problem can be solved. We can perform non-naive parsing by using charts. In their simplest form, charts are essentially a record of what pieces of information we found and where (for example, that the initial string ``The robber'' is an np). Sophisticated parsing algorithms use the chart as a look-up table: they check whether they have already produced an analysis of a chunk of input. This saves them having to repeat needless work.
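As a first impression of what such a record might look like, here is a minimal sketch. It rests on our own assumptions: the chart is stored as dynamic facts arc(From, To, Cat), meaning that the input between positions From and To has been analyzed as Cat, and the toy grammar and all predicate names are invented for this example. It is not the chart parser we will develop later in the course, and it is written for clarity rather than speed.

```prolog
:- dynamic arc/3.

% Toy grammar and lexicon, assumed for illustration only.
rule(s,  [np, vp]).
rule(np, [det, n]).
rule(np, [pn]).
rule(vp, [tv, np]).
rule(vp, [sv, s]).

lex(the, det).     lex(robber, n).
lex(vincent, pn).  lex(marsellus, pn).
lex(knew, tv).     lex(knew, sv).     lex(shot, tv).

% chart_recognize(+Words): fill the chart, then look up an s spanning
% the whole input.
chart_recognize(Words) :-
    retractall(arc(_, _, _)),
    scan(Words, 0, End),
    close_chart,
    arc(0, End, s).

% scan(+Words, +From, -End): record each word with its position.
scan([], End, End).
scan([Word|Words], From, End) :-
    To is From + 1,
    assertz(arc(From, To, Word)),
    scan(Words, To, End).

% close_chart: keep adding arcs until nothing new can be derived.
close_chart :-
    new_arc(From, To, Cat), !,
    assertz(arc(From, To, Cat)),
    close_chart.
close_chart.

% new_arc(-From, -To, -Cat): an arc that follows from the chart but is
% not yet in it. The final check is the look-up that prevents repeats.
new_arc(From, To, Cat) :-
    (   arc(From, To, Word), lex(Word, Cat)
    ;   rule(Cat, RHS), covers(RHS, From, To)
    ),
    \+ arc(From, To, Cat).

% covers(+Cats, ?From, ?To): Cats match adjacent arcs from From to To.
covers([Cat], From, To) :-
    arc(From, To, Cat).
covers([Cat1,Cat2|Cats], From, To) :-
    arc(From, Mid, Cat1),
    covers([Cat2|Cats], Mid, To).
```

After the query

?- chart_recognize([the,robber,knew,vincent,shot,marsellus]).

has succeeded, the chart contains the fact arc(0, 2, np) exactly once. The fact that ``The robber'' is an np is established a single time and then simply looked up, both for the sentence ``The robber knew Vincent'' (positions 0 to 4) and for the whole input.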

And the use of charts really does pay off: naive algorithms are, in the worst case, exponential in the size of their input --- which, roughly speaking, means they may take an impossibly long time to run no matter how good the implementation is. The use of charts, however, turns context free parsing into a polynomial problem. That is, roughly speaking, it turns the problem of finding a parse into a problem that can be solved in reasonable time, for any context free grammar.

We will study the idea of chart parsing later in the course: it is an idea of fundamental importance. But note well: chart parsing is not an idea we need to study instead of the various naive parsing strategies --- it's something we need to study in addition to them. The various naive strategies (bottom up, top down, left corner, ...) are importantly different, and we need to understand them. What is nice about chart parsing is that it is a general idea that can remove the naivete from all these approaches.