After my first attempts with ANTLR scanners in Python and Java, I decided to go back to Bison/Flex to see the difference in performance.
So first I needed to write the grammar/lexer files from scratch, using only the ECMAScript 4 specification and a lot of patience (the elastic grammar file helped me a lot too).
After finishing a first version of the parser I tested it on the same file (a 75 KB ActionScript file) that both the Java and Python versions took more than 1 second to parse.
The result was unbelievable: 0.02 seconds for that file!
Then I tested it on multiple files: for about 320 files from the whole Adobe corelib library it took 220 ms.
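For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing script in Python. It is not the script I used for the numbers above. It assumes the parser binary takes a source file path as its last argument and exits with a non-zero status on failure; the function name `time_parser` and the `*.as` file pattern are just illustrative choices.

```python
import subprocess
import sys
import time
from pathlib import Path


def time_parser(command, files):
    """Run `command + [file]` once per file.

    Returns (total wall-clock seconds, list of files whose run failed).
    """
    failed = []
    start = time.perf_counter()
    for path in files:
        result = subprocess.run(
            command + [str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Assumption: a non-zero exit status means the parse failed.
        if result.returncode != 0:
            failed.append(path)
    return time.perf_counter() - start, failed


def main(argv):
    # Hypothetical usage: python bench.py ./parser path/to/sources
    binary, root = argv[1], Path(argv[2])
    sources = sorted(root.rglob("*.as"))
    elapsed, failed = time_parser([binary], sources)
    print(f"{len(sources)} files in {elapsed * 1000:.0f} ms, {len(failed)} failed")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Note that this starts one process per file, so the total includes process startup time and will read higher than a single run that parses all files in one go.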
OK, the parser is not complete yet and doesn't handle regexp and XML syntax, but its performance convinced me enough…
Now the next step is to finish and test the parser, then create a Python library using Pyrex, and finally run the benchmark again.
If someone is interested in testing the parser, download it (use “parser --help” from the command line for usage help), but remember this is only a first test, not really useful right now (I just wanted to share my text/parsing experiences).