| CSC467F Compilers and Interpreters | Fall 2005 |
In phase 2 you are required build a parser recognizing the grammar of the CSC467 source language, and implement the "trace parser" functionality (-Tp switch) of the compiler, as given in the compiler man page. At this stage no semantic analysis is done, nor is any permanent data structure built.
The starter code is available for download from this web site in a single file, starter2.tar.gz. Once the file is downloaded into your working directory you can use gunzip and tar to extract the source files; these are:
The starter code from phase 1 (including common.h and globalvars.c) is still available from this web site. main.c and makefile are slightly modified from the phase 1 versions; you can edit the changes into your own versions if you prefer. The only files you need to modify are the parser.y file, and the scanner.l file if you chose to use your own rather than the provided one.
The starter code (with common.h and globalvars.c) will compile into a working program that has all required phase 1 functionality. It has no default output unless an error is detected; this should remain the default behaviour when phase 2 is complete.
The source language grammar as given contains a some elements that are essentially semantic in nature; in purely formal terms some of the non-terminals and productions are redundant. For example, both constants and variables can be expressions, but since they both reduce to identifiers, only one is needed to recognize a syntactically (as opposed to semantically) correct use of an identifier in an expression. Furthermore, if both are included in a yacc style parsing file certain conflicts will be introduced. Hence in the parser file not every production in the grammar handout will be implemented. On the other hand, in order to make the grammar LALR, some other productions will have to be introduced; these productions should not change the language in essence, but they are necessary if we are to have a non-ambiguous derivation of the parse tree.
In this phase you are not required to actually build the parse tree yet, but you should try to implement the productions in such a way that in phase 3 no new rules etc. need to be added. As well:
When the -Tp (trace parsing) switch is activated in the compiler, the global variable traceParser is set to "TRUE" (1). Trace output should be sent to the globally visible FILE * variable traceFile. At this stage, if the input is valid no other action has to be taken other than that specified above.
The format for the trace output should be taken verbatim from the file productions.txt. This file consists of lines of the form:
LHS -> RHSThe trace output should be given at the point when your rule has finished implementing the given production. Important note: not every rule in your grammar or even the source grammar will have a corresponding string in the productions.txt file; in particular, rules used to disambiguate the grammar or collect lists of constructs are not there and should not show up in the trace output. As well, since reporting will only be done after the construct has been fully parsed, trace output for all subconstructs will precede that for the containing construct. Finally, since (as mentioned above) determining whether an unmodified identifier in an expression is a variable or constant is impossible until semantic checking is implemented, all unmodified identifiers in expressions will be assumed to be variables (this does not effect syntactic correctness).
An error reporting function, yyerror(), is given in the starter code for parser.y. This function should not be modified; it is called automatically by the parser if there is no rule matching the current input tokens. Hence for this phase you do not need to implement any explicit error checking; the quality of your parser will determine if all syntactic errors are found or not.
The documentation should focus on general grammar issues (i.e. strategies you used to get rid of parser conflicts), and testing strategy (for the most part I want to see if there was some methodical effort made to verify that the parser caught all syntactic errors, and those only). Electronic versions of all your source code (modified or not) plus your documentation should be tarred and gzipped together and submitted via ECF's submit facility as assignment 2 by midnight of the due date by one member of your group (Note: starter code for phase 3 will be made available the following day). If you experience any problems with submission, notify me at fvb@cs) as soon as possible.