Top Down or Bottom Up for Expression Evaluation ?

Last time I wrote about Top Down parsing, a technique for parsing that can be easilly implemented manually using function recursion.

Today it is time to talk about one of the other ways of parsing, called Bottom Up parsing. This technique try to find the most important units first and then, based on a language grammar and a set of rules, try to infer higher order structures starting from them. Bottom Up parsers are the common output of Parser Generators (you can find a good comparison of parser generators here) as they are easier to generate automatically then recursive parser because they can be implemented using sets of tables and simple state based machines.

Writing down manually a bottom up parser is quite a tedious task and require quite a lot of time; moreover, this kind of parsers are difficult to maintain and are usually not really expandable or portable. This is why for my tests I decided to port bYacc (a parser generator that usually generates C code) and edit it so it generates ActionScript code starting from Yacc-compatible input grammars. Having this kind of tool makes things a lot easier, because maintaining a grammar (that usually is composed by a few lines) is less time expensive than working with the generated code (that usually is many lines long).
I will not release today the port because actually I had not time to make sure it is bugfree and I’ve only a working version for Mac, but I plan to release it shortly if you will ever need it for your own tasks. My goal for today was to compare the speed of the parser I wrote with an automatically generated bottom up parser, to see which is the faster approach.

I created a bottom up parser which is able to execute the same expressions accepted by the expression evaluator I wrote last time. There are anyways some differences that – as you will probably and hopefully understand in the future – that make those parsers really different. Some will be disussed shortly here.

To do that I created a Yacc grammar and some support classes.
The parser grammar is really simple and really readable:
%{
%}
%token NUMBER SYMBOL
%left '+' '-'
%left '*' '/'
%left NEG
%%
program
: expr { Vars.result = $1; }
;
expr
: expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '-' expr %prec NEG { $$ = -$2; }
| '(' expr ')' { $$ = $2; }
| SYMBOL '(' expr ')' {
if( Vars.SYMBOL_TABLE[ $1 ] )
{
$$ = Vars.SYMBOL_TABLE[ $1 ]( $3 );
} else
{
trace( "can't find function" );
}
}
| SYMBOL {
if( Vars.SYMBOL_TABLE[ $1 ] )
{
$$ = Vars.SYMBOL_TABLE[ $1 ];
} else
{
trace( "can't find symbol" );
}
}
| NUMBER { $$ = yyval; }
;
%%

Continue reading the extended entry to see the results. Continue reading