CSC467F Compilers and Interpreters | Fall 2005 |
The good news is, you will not be using any new and unfamiliar tools in this and the next phase of the project. The bad news is that for this phase and the next you will have to do a lot of good old-fashioned C coding and debugging.
The requirements for phase 3 are:
The starter code is available for download from this web site in a single file, starter3.tar.gz. Once the file is downloaded into your working directory you can use gunzip and tar to extract the source files; these are:
The provided ast.c, ast.h, semantic.c, semantic.h, symbol.c, and symbol.h have practically no functionality, and are included here for consistency with declarations in scanner.l and parser.y. The code here (with common.h and globalvars.c) should compile as is into a working program which implements the requirements of phase 2 (no default output, -Tn, -Tp, and -Ds working).
In this phase of the project, the parser.y, ast.c, ast.h, semantic.c, semantic.h, symbol.c, and symbol.h must be modified. You are welcome to use any other code/header files you deem useful and/or necessary, and modify main.c, common.h, and globalvars.c as you see fit. Any such changes should be reflected by appropriate modifications to your makefile.
The most important part of this phase, upon which everything else hinges, is the creation of the AST. This will require making suitable data declarations in the file ast.h (which should be included in your parser.y and scanner.l files). The file ast.h should have the data definitions for the different kinds of nodes your parse tree will contain, as well as function declarations for code in ast.c that needs to be externally visible. ast.c will contain code for subfunctions that aid in the construction of the tree, the function PrintTree() which dumps the AST in the required format, and whatever routines you require for cleanup after the compiler is finished with the AST.
Modifications required in parser.y are:
Since the life of the AST will be longer than any single function that it comes in contact with, one of the last actions of the parser should be to assign the root of the finished tree to a global pointer somewhere. This pointer can be in globalvars.c, ast.c, or anywhere else convenient, as long as it is made visible to other modules that have to see it (for example, the code generator).
The PrintTree() routine in ast.c outputs a text
representation of the AST to the globally defined *FILE
dumpFile. If the -Da switch is set in the compiler (and
no errors are detected by the parser or semantic modules) main
will call PrintTree(); hence no explicit checking whether
the dumpAST global flag is set needs to be done by your routine.
See the web page giving the output specifications
for PrintTree() for details about the required format. These specs
are also worth a look for ideas about the kinds of data structures you will
need.
General advice about AST structure: you should try to strike a balance in the number of typed data structures you use in building the tree. Certainly a different struct for every rule or symbol in the grammar is too much, but trying to represent everything using only one or two kinds of data structures will not work very well.
As well, we will be implementing the semantic analysis in this phase. This is the last error checking we will do in the compiler, so any source program that makes it past this point should compile into a runnable program (though whether it crashes or not after it has started running is another issue). The semantic checking should be done in a separate pass, initiated by a single function call once the entire parse tree has been built. The full semantic analysis will require a fair-sized set of mutually recursive functions to do properly and efficiently. For error reporting you are welcome to use any format and as much information as you please, but you should try to make the format consistent with the error reporting from the other phases of the project. On finding a semantic error you should set the global errorOccurred flag and terminate all activity by backing out of all recursive calls with an appropriate return value.
In order to do the semantic analysis you will need a symbol table module. This module will also be used by the code generator, so its functionality should be fairly generic i.e. not too hand-in-glove with the semantic checking routines. It is not important that the symbol table be very efficient; do not use a hash table unless your group has a lot of free time on its hands. While "dump symbol table" functionality (-Dy) is not a mandatory part of this phase, if you do decide to implement it for debugging purposes please make it conditional on the -Dy switch being set in the command line (this will decrease the chance that your debugging output will contaminate my test runs).
See the semantic checks web page for more detailed information about the semantic routines.
Since this phase will be very involved and labour intensive, your group should start work on it as soon as you are able. Also, progress should be made on different fronts whenever possible; a good approach would be to have at least one member working on the symbol table module while the others completed the AST construction actions in the parser file, and then have the group split into two components to work on the semantic module and the PrintTree() function concurrently.
The documentation for phase 3 should include a clear description of the various data structures your AST is built from, an overview of your implementation of the semantic analysis, your testing strategy for this phase of the compiler (but no test runs), and a listing of any changes you have made to the grammar rules (not actions) in the original parser. This last should include a mention of whether you made use of the supplied parser and/or scanner or not.
Electronic versions of your documentation and all source code files (including e.g. globalvars.c and common.h whether they are modified or not) should be tarred and gzipped together and submitted via ECF's submit facility as assignment 3 by midnight of the due date by one member of your group. If you experience any problems with submission, notify me at fvb@cs) as soon as possible.