I just wanted to share a little experience with generating an AS3 parser using antlr and python.
I was trying first to create the parser using GNU Flex and Bison in C, probably the best way for a very performancing parser.
Yeah, that’s right.. but looking at the antlr syntax I realized that’s easier and easier.
Moreover I start using this very useful eclipse plugin for antlr debugging which made my life easier!
The grammar file I created is a compromise between the asdt grammar file and the ECMA-262 grammar specification.
Once finished working on my eclipse project I’ve managed to parse succesfully all the adobe corelibs files using this java test file:
package org.sepy.core.parsers.as3;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import antlr.CommonAST;
import antlr.RecognitionException;
import antlr.TokenStreamException;
public class Application {
public static void main(String argv[])
{
if(argv.length > 0)
{
File file = new File(argv[0]);
if(file.exists())
{
FileInputStream is = null;
try {
is = new FileInputStream(file);
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
AS3Lexer L = new AS3Lexer(is);
AS3Parser P = new AS3Parser(L);
try {
P.compilationUnit();
} catch (RecognitionException e) {
// TODO Auto-generated catch block
System.out.println(" line=" + e.line + ", column="+ e.column);
System.out.println(e.getMessage());
e.printStackTrace(System.err);
} catch (TokenStreamException e) {
// TODO Auto-generated catch block
System.out.println(" line=" + L.getLine() + ", column="+ L.getColumn());
System.out.println(L.getGuessInfo());
System.out.println(e.getMessage());
e.printStackTrace(System.err);
}
CommonAST.setVerboseStringConversion(false, P.getTokenNames());
CommonAST ast = (CommonAST) P.getAST();
System.out.println("Tree:");
System.out.println(ast.toStringTree());
}
}
}
}
Ok, done that I decided to export the grammar file for python (thanks to antrl python export feature).
Everything works fine also for python, but I realized that the python script were so much slower than the java one!
import sys
import antlr
import AS3Parser
import AS3Lexer
L = AS3Lexer.Lexer(filename);
P = AS3Parser.Parser(L);
P.setFilename(filename)
try:
P.compilationUnit();
ast = P.getAST();
except:
pass
On a 75Kb actionscript file the python script took about 7 seconds to run, while the java application only 2 seconds. I know python interpreter caould be slower than many other languages, but I never thought so much slower.
So I run the python hotshot profiler to see which could be the bottleneck in the python script and I found most of the problems were due to unuseless antlr (the python module) method’s calls.
After making corrections to the antlr.py file the same script took exactly half of the time. Now 3 seconds. Wow 🙂
But not fast enough.
So I enabled for the antlr python script psyco module and this time the same script took only 1.6 seconds.
Now the python script is fast enough, even if I’m sure I can make more optimizations in the antlr module…