Creating an ActionScript parser...



IQAndreas
December 20th, 2009, 05:38 PM
I always said I was going to do it, and now I finally will.

Just something small for a little project I had in mind. There are a few hurdles, and I'm not sure how efficient my ways around them will be, but I'll ask those questions when I get there if I can't get past them.


My first question is, has this ever been done before? Are there any already existing AS3 interpreters written in AS3? Are there any interpreters written in other languages that perhaps have a development guide or thoughts on what to keep in mind to avoid as much rewriting as possible?

wvxvw
December 20th, 2009, 06:17 PM
d.Eval - although, they aren't open-source.
poonyScript - well, it's not AS3, it's another VB-like language written in AS3. (they aren't opensource either).
There are sources of ASC in the Adobe repository :) That's open-source and even documented! (somewhat).
HaXe templates may achieve similar purposes, not very robust though...
Ah, and there is JSC - that's not precisely AS3, but it compiles ECMAScript-like languages, so, may be useful too.

Is it OK to ask why you need this?

IQAndreas
December 20th, 2009, 08:23 PM
Is it OK to ask why you need this?
For teaching ActionScript. :)

I want users to be able to have an area where on the right they see a list of available classes for this lesson, the name of stage items, helper functions, or whatever they need at that time. On the left side they have a textField where they type in the ActionScript and hit "Run" to test the code, and they can see their creation. Then they are encouraged to mess around with the code a bit and see if they get different results.

It doesn't need speed or performance, just a way to visually and quickly let them see the change in results by what they change in the code.

Also, in addition to just having TextFields with code, some code they will only be able to change values in, like little "fill in the blank" areas or drop down lists or numeric steppers on prewritten code. Just some way to teach them to look at code as individual bits and pieces that all contribute a little in their own way.


I'm hoping to finally release several interactive ActionScript tutorials as well as release a framework to allow others to write tutorials.


It shouldn't be too difficult. I just need to be able to pick out and separate variables from functions and properties and relay those commands to whatever Class they refer to.

The biggest problem I can think of is if there are errors in the code. Preferably, the error messages should be similar to the ones given by Flash. The problem is also tracking down which line is causing the error, since many errors may not be noticeable until several lines down, primarily "bracket errors". Perhaps it would simplify things a lot if, unlike in Flash, all lines were REQUIRED to end with a semicolon.

Ugh... I'll just worry about that part when I get there... [/2am babble]
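
A minimal sketch of the "bracket error" problem mentioned above, written in TypeScript as a stand-in for AS3. The names checkBrackets and BracketError are hypothetical, and the sketch assumes there are no brackets inside strings or comments, which a real scanner would have to skip first. It keeps a stack of open brackets together with the line each one was opened on, so an error noticed several lines later can still point back to where the bracket started.

```typescript
// Hypothetical helper; nothing here comes from Flash or the thread.
interface BracketError {
  line: number;    // 1-based line the error is reported on
  message: string;
}

// Maps each closing bracket to the opening bracket it must match.
const CLOSERS: Record<string, string> = { ")": "(", "]": "[", "}": "{" };

function checkBrackets(source: string): BracketError[] {
  const errors: BracketError[] = [];
  const stack: { ch: string; line: number }[] = [];
  let line = 1;

  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "\n") {
      line++;
    } else if ("([{".indexOf(ch) !== -1) {
      // Remember where every open bracket started.
      stack.push({ ch, line });
    } else if (Object.prototype.hasOwnProperty.call(CLOSERS, ch)) {
      const top = stack.pop();
      if (top === undefined) {
        errors.push({ line, message: `Unexpected '${ch}'` });
      } else if (top.ch !== CLOSERS[ch]) {
        errors.push({
          line,
          message: `'${ch}' does not match '${top.ch}' opened on line ${top.line}`,
        });
      }
    }
  }

  // Anything still on the stack was never closed; report its opening line.
  for (const open of stack) {
    errors.push({ line: open.line, message: `'${open.ch}' is never closed` });
  }
  return errors;
}
```

Calling checkBrackets on the user's text before running anything would give a list of line numbers to highlight. Requiring semicolons would not remove the need for this check, since brackets can still span several statements.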

TheCanadian
December 21st, 2009, 01:25 AM
Start learning regex.

wvxvw
December 21st, 2009, 04:05 AM
Considering the lexer only:
http://code.google.com/p/e4xu/source/browse/trunk/src/org/wvxvws/parsers/CodeParser.as
Here's something I did some time ago... It has a few annoying bugs, plus it cannot handle inline XML. The purpose was also somewhat different - it creates colorized HTML text to display in a browser.

And this is basically the same thing:
http://code.google.com/p/e4xu/source/browse/#svn/trunk/src/sharp/ExportHTML
I've ported it to C# and made it a FlashDevelop plugin... However, all the bugs and the inability to parse inline XML are there too :)

And sorry to sound like a bore - but, RegExp will definitely help, but it alone won't solve it - AS3 has a very complex syntax in terms of parsing... To be honest, it has the most complex syntax of all languages that I know...

TheCanadian
December 21st, 2009, 04:20 AM
Well, obviously there's more to it than a single regex, but since compilers basically take a giant string and turn it into something readable by the VM, I can't think of a better place to start than with parsing strings. I also can't think of a better way of parsing strings than regex.

Krilnon
December 21st, 2009, 01:59 PM
Regular expressions are typically only used for building scanners (or lexers, depending on who you ask). True regular expressions can't even capture a context-free grammar (and most programming languages are context-free) because they can only describe regular languages. Even though the so-called regular expressions in programming language libraries are usually more powerful than true regular expressions, the syntax doesn't really lend itself to describing programming language syntax in general.

But using regular expressions is definitely a great place to start because compilers usually start with the scanning!
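
To illustrate the scanner point, here is a rough sketch in TypeScript (standing in for AS3) of a regex-driven scanner. The token kinds and patterns are assumptions chosen for illustration and cover only a tiny fraction of what AS3 allows (no comments, regex literals, or inline XML). It splits the text into a flat list of tokens and knows nothing about nesting, which is the part a parser has to handle.

```typescript
// Hypothetical token kinds; a real AS3 scanner needs many more.
type TokenKind = "whitespace" | "number" | "string" | "identifier" | "punctuator";

interface Token {
  kind: TokenKind;
  text: string;
}

// Each pattern is anchored with ^ so it only matches at the current position.
const TOKEN_PATTERNS: [TokenKind, RegExp][] = [
  ["whitespace", /^\s+/],
  ["number", /^\d+(\.\d+)?/],
  ["string", /^"(\\.|[^"\\])*"/],
  ["identifier", /^[A-Za-z_$][A-Za-z0-9_$]*/],
  ["punctuator", /^[{}()\[\].,;:=+\-*\/<>!]/],
];

function scan(source: string): Token[] {
  const tokens: Token[] = [];
  let rest = source;

  while (rest.length > 0) {
    let matched = false;
    for (const [kind, pattern] of TOKEN_PATTERNS) {
      const m = pattern.exec(rest);
      if (m !== null) {
        // Whitespace is consumed but not kept.
        if (kind !== "whitespace") {
          tokens.push({ kind, text: m[0] });
        }
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected character: ${rest.charAt(0)}`);
    }
  }
  return tokens;
}
```

Note that scan("((") succeeds and simply returns two punctuator tokens. Deciding whether brackets balance, or which expression belongs to which call, takes a stack or a recursive parser on top of the token list.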

TheCanadian
December 21st, 2009, 03:29 PM
I'm interested what you would do.

glosrfc
December 21st, 2009, 03:43 PM
My first question is, has this ever been done before?
http://wonderfl.net/
http://wonderfl.net/about

Krilnon
December 21st, 2009, 04:21 PM
I would use the extant ActionScript compiler that Adobe wrote in Java. If it had to be in AS3 for some reason, I would rewrite the entire compiler. IQAndreas said this is "just something small for a little project," but I don't think that it's feasible to write something "small" that still correctly parses ActionScript. Like wvxvw said, it's a big language.

IQAndreas said that this is supposed to be part of a learning tool for others, so, in my opinion, it would be terrible to teach people with a tool that didn't correctly understand the language that it was trying to teach.

You wouldn't really need much, if any, of the back end of the compiler for IQAndreas's project because he's not generating a SWF at the end (or if he is, hopefully that would be done with Adobe's ASC). The scope of what he's trying to accomplish is a little vague, like when he says:
It shouldn't be too difficult. I just need to be able to pick out and separate variables from functions and properties and relay those commands to whatever Class they refer to.

To me, that sounds a bit naive if he plans to support the entire language, because it's not particularly "easy" to pick out the class that defines a particular method. For instance, you could have a couple classes with the same name that are in separate namespaces, but the user has specified an open namespace in a level of lexical nesting that is several levels above the current scope, or whatever. It's also not particularly easy to do informative syntax error recovery, which is another thing that he mentioned that he wanted to do. You'd need a pretty accurate parser to do the kind of recovery that people expect. It'd need to be complete, too, because you wouldn't want to generate errors for valid syntax that you were too lazy to support. How would anyone learn a language from that?

That's why I'd just use Adobe's implementation or spend a long time writing my own, and it'd be huge. It's not like one dude sat down one day and wrote ASC in his spare time. The ECMA-262 3rd edition (ECMAScript 3) grammar, which is simpler than the one used in AS3/ES4, is something like 17 pages long. That's just the grammar, which you'd usually feed into a parser or parser generator. Your code would still have to figure out the semantics of those 17 pages of grammar productions and generate everything that IQAndreas wants for his program.

Anyway, there's basically a whole branch of computer science that deals with this stuff, so it's kind of amusing to see IQAndreas say "It shouldn't be too difficult" when plenty of pretty smart people have spent ages doing this sort of stuff. I think some of the things that he wants are more difficult to do than they appear at first glance.

TheCanadian
December 21st, 2009, 04:38 PM
I absolutely agree, but it seems that all he's trying to do is teach how to write a function or how to use a loop.

wvxvw
December 21st, 2009, 04:50 PM
Well, if I had to pick the simplest language for giving someone a basic understanding without developing complex tools, it would be Python or Lua... Their syntax is much easier, plus there are countless implementations of Python in other languages (IronPython for .NET, Jython for Java).
Or... I don't know, maybe Alpha Basic or QBasic? :D I think that's what I started with.
There are other languages which are even simpler to parse, but they are hard to digest for human reader...

Krilnon
December 21st, 2009, 05:40 PM
I absolutely agree, but it seems that all he's trying to do is teach how to write a function or how to use a loop.
Yeah, but you can put a lot of different syntax into the body of a function. Like I said in my last post, IQAndreas's post was a bit vague in terms of what he was looking to support within his application. However, he did mention some things that would be impractical to do without building a real parser anyway.

IQAndreas
December 21st, 2009, 06:38 PM
http://wonderfl.net/
http://wonderfl.net/about
Finally! That site has been on my mind, but I forgot the name of it. Thank you. :) So, does that site compile the SWF on the server side, or is it created each time it is requested? The latter might be slower and more strenuous for the server, so I'm guessing it's the first option.


Wow... Quite a few replies. Wasn't expecting that.

I actually didn't have in mind to support the entire language, but, as you said, some users might start trying out all of Flash's features, and some of them might get annoyed when those turn out not to be supported.

I don't want all classes to be open to users so they can create ANYTHING. The point is not to let them create things in Flash, but to teach them only the part you want them to learn right now. All available classes are added by the "tutorial writer" something like this:

//The constant TIMELINE treats all code as if it were on frame 1 of the timeline,
//not needing any "package" or "class" declarations.
var example2:ASCode = new ASCode(ASCode.TIMELINE);

example2.addClass("com.greensock.TweenLite", TweenLite);

//The code can now reference "mc1" or "this.mc1" without needing to
//create a new instance or import anything.
example2.addStageItem("mc1", new CircleMC());

example2.code = code_txt_input.text;

if (example2.hasErrors)
{
    example2.listErrors();
}
else
{
    example2.showDialog(); //Executes the code and shows the results in a new window
}
The writer tells the "ASCode" class which classes will be used. Instead of interpreting the source of those classes line by line, the parser will go directly to the real, already compiled class and run the code there.

It may be a bit of work, but all the code really needs to do is keep a list of all variables, functions, and instances. Then, if a line like this appears:

TweenLite.to(mc1, 3, {alpha:1})
The ASCode class will then dig into the "TweenLite" class, determine that the static function "to()" is being called, and pass in the arguments, replacing the text "mc1" with the instance stored in the variables/instances list. Am I clear on what I am trying to do?
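
A rough sketch of that lookup-and-call step, in TypeScript as a stand-in for AS3. Everything here is hypothetical: LessonScope, Arg, and callStatic are illustration names, not part of the ASCode class described above, and the sketch assumes the line has already been parsed into a class name, a method name, and a list of arguments. References such as mc1 are resolved against the registered instances, and everything else is passed through as a plain value.

```typescript
// Hypothetical argument shape produced by some earlier parsing step.
type Arg =
  | { kind: "ref"; name: string }      // an identifier such as mc1
  | { kind: "value"; value: unknown }; // a literal such as 3 or {alpha:1}

class LessonScope {
  private classes: Record<string, object> = {};
  private instances: Record<string, unknown> = {};

  addClass(name: string, cls: object): void {
    this.classes[name] = cls;
  }

  addStageItem(name: string, item: unknown): void {
    this.instances[name] = item;
  }

  // Only names the tutorial writer registered are visible to user code.
  private lookup<T>(table: Record<string, T>, name: string, what: string): T {
    if (!Object.prototype.hasOwnProperty.call(table, name)) {
      throw new Error(`${what} "${name}" is not available in this lesson`);
    }
    return table[name] as T;
  }

  // Runs an already-parsed static call such as TweenLite.to(mc1, 3, {alpha:1}).
  callStatic(className: string, method: string, args: Arg[]): unknown {
    const cls = this.lookup(this.classes, className, "Class");
    const fn = (cls as Record<string, unknown>)[method];
    if (typeof fn !== "function") {
      throw new Error(`${className}.${method} is not a function`);
    }
    const resolved = args.map((a) =>
      a.kind === "ref" ? this.lookup(this.instances, a.name, "Instance") : a.value
    );
    return fn.apply(cls, resolved);
  }
}
```

The "not available in this lesson" errors are one place where the restricted class list could be turned into a readable message for the learner. The hard part remains producing the parsed call in the first place.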

Since classes passed in to the code will be the same as classes that are running inside the tutorial SWF, a major problem I can already see now is if the user changes static variables or any part of classes which will then affect how the rest of the tutorial code runs. Eek... That could be fatal. :hangover:
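
One possible way to limit the damage from changed static variables, sketched in TypeScript as a stand-in for AS3: take a shallow snapshot of each shared class's static properties before running user code and put them back afterwards. The names snapshotStatics, restoreStatics, and runIsolated are hypothetical. The sketch relies on static properties being enumerable on the class object, which holds in TypeScript/JavaScript; in AS3 the statics would have to be discovered some other way (for example via describeType), and objects referenced by a static are not copied, so changes inside them would survive.

```typescript
// Shallow copy of a class's own enumerable static properties.
function snapshotStatics(cls: object): Record<string, unknown> {
  const table = cls as Record<string, unknown>;
  const saved: Record<string, unknown> = {};
  for (const key of Object.keys(cls)) {
    saved[key] = table[key];
  }
  return saved;
}

// Puts the saved values back and drops statics added in the meantime.
function restoreStatics(cls: object, saved: Record<string, unknown>): void {
  const table = cls as Record<string, unknown>;
  for (const key of Object.keys(cls)) {
    if (!Object.prototype.hasOwnProperty.call(saved, key)) {
      delete table[key];
    }
  }
  for (const key of Object.keys(saved)) {
    table[key] = saved[key];
  }
}

// Runs user code, then restores the statics even if the code throws.
function runIsolated<T>(classes: object[], run: () => T): T {
  const snapshots = classes.map((cls) => ({ cls, saved: snapshotStatics(cls) }));
  try {
    return run();
  } finally {
    snapshots.forEach((s) => restoreStatics(s.cls, s.saved));
  }
}
```

This only covers plain static values. Side effects such as running tweens, added event listeners, or changed display objects would need their own cleanup.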

Hm... I don't want the parser to interpret all classes, only the code the user has typed in... Guess I need some more thinking...