Formal Grammars and Notation You will need to be familiar with the notion of a formal grammar, and in particular need to be able to work with a specification of a new language detailed in this document.
If the assignment is difficult to read, we suggest reviewing the foundational concepts.
The starter code for the lexer sits in a file named "lexer.lex", and the starter code for the parser
sits in a file called "parser.yy".
The Lexer
Your lexer must be able to recognize the following kinds of tokens in addition to all the keywords and operators listed above:
integer constants consisting of one or more digits
string constants wrapped in double quotes and supporting the following escaped characters: \\ \" \n \t
None constant, equivalent to "NULL" in Java
'true' and 'false'
Name identifiers that start with a letter or underscore, followed by sequence of letters, underscores and numbers. so x0 is a valid variable name, but 0x is not
You need to make sure that all arithmetic operators are left associative, so (w+x+y+z) should be parsed as (((w+x)+y)+z)
AST
Your parser must produce an AST with nodes for the following program constructs:
Block ::= [Statement] ;;
Global ::= name ;;
Assignment ::= LHS Expression ;;
ExpressionStatement ::= Expression ';' ;;
IfStatement ::= Condition ThenPart ElsePart ;;
WhileLoop ::= Condition Body ;;
Return ::= Expression ;;
;;
FunctionDeclaration ::= [Arguments] Body ;;
BinaryExpression ::= LeftOperand Operator RightOperand ;;
UnaryExpression ::= Operand Operator ;;
FieldDereference ::= BaseExpression Field ;;
IndexExpression ::= BaseExpression Index ;;
Call ::= TargetExpression [Arguments] ;;
Record ::= Map[String, Expression] ;;
IntegerConstant ;;
StringConstant ;;
NoneConstant ;;
Deliverables
Your parser should read a program from stdin, parse it into an AST and pretty print the AST to stdout.
Specifically, you must:
Write the parser and lexer
Define all the types for your AST nodes
Define a visitor interface
Define a PrettyPrinter class that extends the visitor interface and pretty prints the AST
We don't care that much about how well your output is formatted. One strict requirement is that your pretty printer should wrap all binary and unary expressions in parenthesis. For unary expressions, the operator should be outside the parenthesis, so for example !x should be pretty printed to !(x).
You should provide 3 tests named test1.mit, test2.mit and test3.mit that your parser can parse correctly. Your tests should provide good coverage of the constructs in the language.
Implementation notes
For this assignment, you do not have to worry about reclaiming memory (we will implement garbage collection in a later assignment).
Questions or comments regarding 6.035? Send e-mail to the TAs at
6.035-staff@mit.edu.