Infix- to Postfix-Notation (RPN): 01. Introduction to the "in2po" library's implementation

Hello! So you would like to learn, how I was able to write this small library which takes an arithmetic formula in the infix notation and transforms it into the postfix notation.

To give you a very rough overview I performed the following steps, to get what I got:

Define the language of arithmetic expressions within the "labelled BNF" syntax. This is a general tool to define programming languages; it may look like an overkill to deal with arithmetic expressions by using concepts from the compiler construction theory, but it is not! That`s because the language of arithmetic expressions itself is not that easy as it looks like; we just get this impression since we`re so used to it. So to tackle such expressions we have to borrow from the concepts of compiler construction.
Now, apply the "BNF Converter" tool to this syntax; it will produce a lexer and a parser for the language we seek for. Along with the parser the classes to model the "abstract syntax tree" (AST) are also produced.
Since we have got the parser and the AST classes, we are now able to traverse the AST. This is done in the 'printer', which has also been produced by the "BNF Converter" tool. The standard printer produces a so called 'linearized view' of the AST: i.e. if we input an arithmetic expression in the infix notation we get the same expression again in the infix notation back. To change the output to the postfix notation, we just need to modify a few lines in the printer`s implementation (just change the visit order of the AST). Voila!

You may still ask yourself why it is necessary to go through such an hassle, just to produce an arithmetic expression in the postfix order; cannot we just use one of the stack based algorithms (e.g. the "Shunting yard algorithm" from Dijkstra) to solve this 'simple' problem?

Well, we could ... but we have to fulfill other constraints as well, which are probably harder to implement around a stack based algorithm: While transforming from the one notation to the other, we have to ensure things like:

The stack size of the postfix expressions should stay rather minimal; e.g. we do not wish expressions like "a b c + *" but we prefer "b c + a *".
Double, triple negations and the like should be handled with; well I cheated a little bit to solve this requirement. Instead of providing a solution which would work on the AST, I just run a preprocessor on the input which eliminates such nasty things.
Constant sub-expression should be evaluated, since we wish our output to be as simple as possible. Well, the actual implementation does not do everything humanly possible and there is still room to improve this part, but more on that later in this how to.
Function expression should be dealt with; this means that things like the expression "f(sin(a) + cos(b))" should be translated into "a sin b cos + f".

Well, you could still argue that a stack based implementation could solve those requirements; probably. But, with the actual approach by treating arithmetic expressions as a tiny programming language we gain generality; if you wish you could easily include further additions like a special power symbol "^" to deal with e.g. exponentiation. And as you will discover, it is really easy to adapt the labelled BNF syntax and to work through the mentioned steps one to three.

So, this was the introductory part; in the following, we will look together on the syntax of our arithmetic expressions; then I will explain you how to use the BNF Converter on this syntax and finally show you which changes you have to do to the printer to get outputs in the postfix notation.

This will be the minimal part; later on we will look at how the mentioned constraints have been implemented. To understand those parts we need to have a look at the "preprocessor", "unifier", "optimizer" ("simplifier") and last but not least the "evaluator'.

Infix- to Postfix-Notation (RPN)

Friday, November 30, 2007

01. Introduction to the "in2po" library's implementation

No comments:

References

About Me

Blog Archive