1.4 FSAs in PROLOG

We will now see how to implement FSAs in Prolog. This is actually a slightly misleading way to describe what we are going to do: although we have been talking about FSAs as machines, we are going to treat them as passive data structures that are manipulated by other programs. In a way, the very first pictures of FSAs in this chapter showed a similar division into a declarative part and a procedural part: the network is a declarative representation of the FSA and the "egg" is the program. Depending on what instructions we give to the "egg", we can use it to generate words (as in the picture) or to recognize words using the same FSA.

Separating declarative and procedural information is often a good idea, because in many cases it makes it easier to understand what is going on in a program, to maintain the program, and to reuse information. In the chapter on ATNs, we will see more about why the machine-oriented view is unattractive.

1.4.1 Representing FSAs in PROLOG

We will use three predicates to represent FSAs:

initial/1
final/1
arc/3

initial(1), for instance, says that 1 is an initial state, final(4) says that 4 is a final state, and arc(1,2,h) says that there is an h transition from state 1 to state 2. We will, furthermore, use the atom '#' to mark jump arcs.

Our first laughing machine will therefore have the following PROLOG representation.

initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(3,2,h).

Here is the machine with the jump arc:

initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(3,1,'#').

And here the non-deterministic version:

initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(2,1,a).

As you can see, the PROLOG representation of these finite state machines is a straightforward translation of the graphs that we have been drawing in the previous sections.
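To illustrate that the FSA is now just data in Prolog's database, here is a small sketch that inspects the first laughing machine with an ordinary query. The helper outgoing/2 is not part of the program developed in this chapter; it is a hypothetical name introduced only for this example, and it relies on the standard findall/3.

```prolog
% First laughing machine, exactly as given in the text.
initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(3,2,h).

% outgoing(+State, -Pairs): hypothetical helper (not from the text).
% Collects a Label-Target pair for every arc that leaves State.
outgoing(State, Pairs) :-
    findall(Label-Target, arc(State, Target, Label), Pairs).
```

The query outgoing(3,Pairs) should bind Pairs to a list holding the ! arc to state 4 followed by the h arc to state 2, i.e. in the order in which the facts appear in the database.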

1.4.2 A Recognizer and Generator for FSAs without Jump Arcs

Now that we know how to represent FSAs, we would of course like to do something with them; i.e. we want to use them to generate and recognize strings. That is, we need programs to manipulate those FSA representations. These programs should be general enough that we don't have to know anything about the structure of a particular FSA before using it. In particular, we would like them to be able to deal with deterministic as well as non-deterministic FSAs. Prolog helps us a lot here, because its built-in backtracking mechanism provides us with the search tool that we need to deal with non-determinism. Furthermore, Prolog is declarative enough that one and the same program can (up to a point) work as both a recognizer and a generator.

Let's first ignore the fact that there may be jump arcs and write a recognizer/generator for FSAs without jump arcs. We will define the predicate recognize1/2, which takes the number of the node you want to start from as its first argument and a list of symbols representing the string that you want to recognize as its second argument. The query recognize1(Node,SymbolList) should succeed if the list of symbols SymbolList can be recognized by the FSA in Prolog's database, starting from node Node and ending in a final state.

We will define recognize1/2 as a recursive predicate, which first tries to find a transition from state Node to some other state reading the first symbol of the list SymbolList and then calls itself with the node it can reach through this transition and the tail of SymbolList.

In the base case, recognize1/2 is called with an empty list, i.e. the whole input has been read. In this case, it succeeds if the Node is a final state:

recognize1(Node,[]) :- final(Node).

In the case where SymbolList is not empty, we first retrieve an arc starting from Node from the database. Then we take this transition, thereby reading a symbol of the input, and recursively call recognize1/2 again.

recognize1(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse1(Label,String,NewString),
    recognize1(Node2,NewString).

The predicate traverse1/3 checks that we can indeed take this transition, i.e. that the label of the arc is the same as the first symbol of the input list, and returns the input without the symbol that we have just read.

traverse1(Label,[Label|Symbols],Symbols).
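As a quick check of how traverse1/3 behaves, the following sketch lists a few queries one could try. Only the clause itself comes from the text; the example inputs are made up for illustration.

```prolog
% traverse1(+Label, +Input, -Rest): the clause from the text.
% Succeeds if Label is the first symbol of Input; Rest is the remainder.
traverse1(Label,[Label|Symbols],Symbols).

% ?- traverse1(h,[h,a,!],Rest).
%    succeeds with Rest = [a,!]: the label matches the first symbol.
% ?- traverse1(a,[h,a,!],Rest).
%    fails: the first symbol is h, not a.
% ?- traverse1(h,[],Rest).
%    fails: there is no symbol left to read.
```

Because the whole check is done by unification in the clause head, no explicit comparison is needed.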

Here is the whole program:

recognize1(Node,[]) :-
    final(Node).

recognize1(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse1(Label,String,NewString),
    recognize1(Node2,NewString).

traverse1(Label,[Label|Symbols],Symbols).

Now, if Prolog should ever retrieve an arc from the database that later turns out to be a bad choice because it doesn't lead to success, it will (automatically) backtrack and look for alternatives.
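To see this backtracking at work, here is a sketch that loads the non-deterministic laughing machine from Section 1.4.1 together with recognize1/2. The input string in the comments is chosen for illustration; the description of the search assumes the usual top-to-bottom order in which Prolog tries the arc/3 facts.

```prolog
% Non-deterministic laughing machine, as given in the text.
initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(2,1,a).

% Recognizer for FSAs without jump arcs, as given in the text.
recognize1(Node,[]) :-
    final(Node).
recognize1(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse1(Label,String,NewString),
    recognize1(Node2,NewString).

traverse1(Label,[Label|Symbols],Symbols).

% ?- recognize1(1,[h,a,h,a,!]).
% In state 2 there are two a arcs. Prolog first tries arc(2,3,a),
% but state 3 has no h arc, so this choice fails. It then backtracks
% and takes arc(2,1,a) instead, from where the rest of the input
% can be read and the query succeeds.
```

A string such as [h,a,h] is rejected: every alternative is tried and none ends in a final state with the input used up.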

As promised, we can use recognize1/2 in two different modes. In the recognition mode, we want to give a list of symbols SymbolList and want to find out whether there is an initial node Node such that the query recognize1(Node,SymbolList) returns yes. Here is a driver predicate for the recognition mode:

test1(Symbols) :- initial(Node), recognize1(Node,Symbols).

In the generation mode, we want to get all lists of symbols which recognize1/2 can generate starting from some initial node. For this, we can just call test1/1 with an uninstantiated variable. test1/1 then selects an initial node and calls recognize1/2 with this node as first argument and an uninstantiated second argument.

generate1(X) :- test1(X).
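The first laughing machine accepts infinitely many strings, so generate1/1 keeps producing new answers on backtracking and a request for all of them never finishes. The sketch below shows one possible way to look at only the short ones by fixing the length of the list before calling generate1/1. The helper generate_up_to/2 is a hypothetical name introduced for this example, and the code assumes that between/3 and length/2 are available, as they are in SWI-Prolog and most other systems.

```prolog
% First laughing machine, as given in the text.
initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(3,2,h).

% Recognizer/generator and drivers, as given in the text.
recognize1(Node,[]) :-
    final(Node).
recognize1(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse1(Label,String,NewString),
    recognize1(Node2,NewString).

traverse1(Label,[Label|Symbols],Symbols).

test1(Symbols) :-
    initial(Node),
    recognize1(Node,Symbols).

generate1(X) :-
    test1(X).

% generate_up_to(+Max, -Strings): hypothetical helper (not from the text).
% Collects every string of length 0..Max that the FSA generates.
generate_up_to(Max, Strings) :-
    findall(X,
            ( between(0, Max, N),   % try lengths 0, 1, ..., Max in turn
              length(X, N),         % X becomes a list of N unbound symbols
              generate1(X) ),       % let the FSA fill in the symbols
            Strings).

% ?- generate_up_to(7, Strings).
% Strings = [[h,a,!],[h,a,h,a,!],[h,a,h,a,h,a,!]].
```

Bounding the length works here because, without jump arcs, every transition consumes exactly one symbol, so the search over a list of fixed length always terminates.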

1.4.3 A Recognizer and Generator for FSAs with Jump Arcs

It is very easy to adapt the recognizer/generator of the previous section so that it can deal with jump arcs. All we have to do is to specify that if an arc is labelled with '#', we can make a transition without doing anything to the input string. So, apart from renaming the predicates to recognize2/2 and traverse2/3, all we have to change is the definition of the traverse predicate, which gets one additional clause:

recognize2(Node,[]) :-
    final(Node).

recognize2(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse2(Label,String,NewString),
    recognize2(Node2,NewString).

traverse2('#',String,String).
traverse2(Label,[Label|Symbols],Symbols).
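As a sketch of how this version can be tried out, the following block combines recognize2/2 with the jump-arc machine from Section 1.4.1. The example queries in the comments are chosen for illustration and call recognize2/2 directly from state 1, the initial state of this machine.

```prolog
% Laughing machine with a jump arc, as given in the text.
initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,4,!).
arc(3,1,'#').

% Recognizer/generator for FSAs with jump arcs, as given in the text.
recognize2(Node,[]) :-
    final(Node).
recognize2(Node1,String) :-
    arc(Node1,Node2,Label),
    traverse2(Label,String,NewString),
    recognize2(Node2,NewString).

traverse2('#',String,String).               % jump arc: input is left untouched
traverse2(Label,[Label|Symbols],Symbols).   % ordinary arc: read one symbol

% ?- recognize2(1,[h,a,h,a,!]).
% succeeds: after the first "ha" the jump arc leads from state 3
% back to state 1 without reading anything.
% ?- recognize2(1,[h,a,h]).
% fails: the input ends in state 2, which is not final.
```

Two caveats follow from the definition of traverse2/3 itself. Since a jump arc consumes no input, an FSA containing a cycle that consists only of jump arcs could send this program into an infinite loop. Also, the second clause still matches when the label is '#', so the symbol '#' should not occur in the input strings.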