15.1 Parsing vs. Generation

In the last chapters we have examined various parsing algorithms. So we now have a pretty good idea of how to go about building a program that analyzes the syntactic structure of a natural language sentence and constructs a representation of its meaning. This is an important component in any system that has to "understand" natural language. Starting from the semantic representation, the system can try to figure out what actions the user wants it to perform next. After the system has carried out these actions, it usually has to give some kind of answer or feedback to the user. And if it accepts natural language input, its responses should preferably be in natural language as well. That means the system not only has to analyze and understand natural language, it also has to be able to produce it; it has to generate natural language.

Here, we will only look at a subtask of natural language generation: we will assume that we have a semantic representation which we want to express as an English sentence. That is, the decisions about what content has to be expressed and how this content is distributed over sentences have already been made. This subtask is often called surface realization. For example, if we have the semantic representation \exists x(woman(x)\ \&\ dance(x)), we want to know which English sentences could be used to express it, such as a woman dances.
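For concreteness, here is one way the input and the desired output of surface realization could be written down in Prolog. This is only a sketch: the term encoding of the formula (exists/2 and and/2) and the word-list representation of sentences are assumptions made for illustration, not notation fixed by this section.

% Input to surface realization: the semantic representation
% exists x (woman(x) & dance(x)), written as a Prolog term
% (one possible encoding; exists/2 and and/2 are just conventions).
example_semantics(exists(X, and(woman(X), dance(X)))).

% One desired output: an English sentence expressing that meaning,
% represented as a list of words.
example_realization([a, woman, dances]).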

In a sense, this task is the opposite of parsing. In parsing, we have a sentence and a grammar as input and want to know what the syntactic structure and/or the semantic representation of the sentence is. In generation, we have a semantic representation and a grammar as input and want to find the sentences that correspond to the semantic representation. We will see in the next section how this can be done.
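To make the reversal concrete, here is a minimal sketch in Prolog. It assumes a toy DCG in which every nonterminal carries its semantic representation as an extra argument; the particular rules and the exists/2, and/2 encoding below are illustrative assumptions, not a grammar given earlier in the text.

% A toy grammar fragment with a semantics argument.
% s(Sem) covers a sentence whose meaning is the term Sem.
s(exists(X, and(Restr, Scope))) --> np(X, Restr), vp(X, Scope).

np(X, Restr) --> [a], n(X, Restr).   % indefinite noun phrase
n(X, woman(X)) --> [woman].
vp(X, dance(X)) --> [dances].

With such a grammar, parsing instantiates the word list and asks for the semantics, while generation instantiates the semantics and asks for the word list:

?- s(Sem, [a, woman, dances], []).
Sem = exists(_A, and(woman(_A), dance(_A))).

?- s(exists(X, and(woman(X), dance(X))), Words, []).
Words = [a, woman, dances].

For a tiny grammar like this one, simply running the DCG "backwards" works; for larger grammars this naive strategy can loop or become very inefficient, which is one reason to study dedicated generation algorithms.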

