Final Project: Jack Compiler

Due Dec 14 (Sunday), before midnight

This project has two additional milestone requirements:

1) Milestone 1 Due: You must demo your Jack program in class/lab Dec 11

2) Milestone 2 Due: Compiler. Dec 12 (Friday)

The APIs in this assignment follow the design in your book The Elements of Computing Systems but with some differences that will allow us to reuse our work more easily across assignments.

On Dec 19th, your full final project is due:

  • Working Jack demo which minimally runs in the VMEmulator

  • Implemented Jack tokenizer, analyzer, and compiler

  • Resubmissions for fixed computer, assembler, and virtual machine code compilers

  • One-on-one demos of your working programs with Prof. Normoyle

  • (Extra credit): Your Jack demo running on your simulated computer

Update your repository

We will use the same repository as Assignment 1. The changes have already been merged into your repository.

$ cd cs240-f25-classwork
$ git pull

Your repository should include new files under the folder named Compiler.

1. Milestone 1

1) Milestone 1 Due: You must demo your Jack program in class Dec 11

1.1. Jack program

In the folder, MyDemo, write a cool Jack program that will entertain and delight your friends, families, and classmates. A cool program might be a

  • Game

  • Simulation

  • Data visualization

  • An interesting algorithm, such as one that implements encryption, optimization, or gene matching. You may use an algorithm or demo from a pervious course, such as data structures, computational biology, AI, or machine learning.

In the file, MyDemo/Readme.md, write any instructions that are needed to run your demo.

2. Milestone #2

2) Milestone 2 Due DecA12 (Friday) by midnight

2.1. Tokenizer

In the file, common/jacktokenizer.cpp, implement a tokenizer for the Jack language. To start, implement the following functions.

  • bool jack_check_symbol(const char* str, JackToken* token): If the given string starts with a symbol, fill the keyword, type, and value into the given token and return true. Otherwise, return false.

  • bool jack_check_keyword(const char* str, JackToken* token): If the given string starts with a keyword, fill the keyword, type, and value into the given token and return true. Otherwise, return false.

  • bool jack_check_int_constant(const char* str, JackToken* token): If the given string corresponds to a non-negative integer, fill the keyword, type, and value into the given token and return true. Otherwise, return false.

  • bool jack_check_string_constant(const char* str, JackToken* token): If the given string corresponds to a quotes string, fill the keyword, type, and value into the given token and return true. Otherwise, return false.

  • bool jack_check_identifier(const char* str, JackToken* token): If the given string corresponds to an identifier, fill the keyword, type, and value into the given token and return true. Otherwise, return false.

  • int JackTokenizer::findNextToken(int startIndex, int endIndex): Starting at startIndex, scans forward in mProgram skipping block comments and whitespace to find the next token. This function should also increment mLineNumber whenever we encounter a newline.

Run test_tokenizer to test.

$ ./test_tokenizer
.... snip
Tokenize mainclass:
  Check value = mainclass (got mainclass): PASSED
  Check keyword = 0 (got 0): PASSED
  Check type = 2 (got 2): PASSED
Token: 4, var, 0, 9
Token: 4, int, 0, 5
Token: 4, a, 2, 0
Token: 4, ;, 1, 0
Token: 5, let, 0, 12
Token: 5, a, 2, 0
Token: 5, =, 1, 0
Token: 5, 3, 3, 0
Token: 5, ;, 1, 0
Token: 5, do, 0, 13
Token: 5, Output, 2, 0
Token: 5, ., 1, 0
Token: 5, printString, 2, 0
Token: 5, (, 1, 0
Token: 5, Hello World, 4, 0
Token: 5, ), 1, 0
Token: 5, ;, 1, 0

Next, in common/jacktokenizer.cpp, implement the following function which outputs the tokenized output to an XML file.

  • bool jack_tokens_writeXML(const char* filename, const std::vector<JackToken>& tokens): Open the given filename for writing. If successful, outputs the given list of tokens as an xml file and returns true. Otherwise, returns false.

Run jack_tokenizer to test.

$ ./jack_analyzer jack/Hello/Main.jack

  class
  Main
  {
  function
  void
  main
  (
  )
  {
  do
  Output
  .
  printString
  (
  Hello World
  )
  ;
  do
  Output
  .
  println
  (
  )
  ;
  return
  ;
  }
  }

You are given a script to help you test.

$ ./test_jack_tokenizer.sh
==== Tokenize TestAdd.jack ====
Files jack/Tests/TestAdd-tokens.xml and jack/Tests/Answers/TestAdd-tokens.xml are identical
... etc

2.2. Analyzer

In the file, common/jackanalyzer.cpp, implement a parser for the Jack language that outputs the parsed language to an XML file. You should implement the following functions.

  • void JackAnalyzer::process(const hack::JackToken& token);: Outputs the current token using the indentatistored in the member variable mIndent and the FILE pointer stored in mOutputFile. The output format should be <tag> value </tag> where tag is the token type (e.g. identifier, integer/string constant, symbol, or keyword) and the value is the token string.

  • void JackAnalyzer::analyzeClass();: Recursively outputs the XML for a class.

  • void JackAnalyzer::analyzeClassVarDec();: Recursively outputs the XML for class variable declarations.

  • void JackAnalyzer::analyzeSubroutineDec();: Recursively outputs the XML for a subroutine declaration.

  • void JackAnalyzer::analyzeSubroutineBody();: Recursively outputs the XML for a subroutine body.

  • void JackAnalyzer::analyzeParameterList();: Recursively outputs the XML for a parameter list.

  • void JackAnalyzer::analyzeVarDec();: Recursively outputs the XML for a variable declaration in a function or method.

  • void JackAnalyzer::analyzeStatements();: Recursively outputs the XML for a list of statements.

  • void JackAnalyzer::analyzeDo();: Recursively outputs the XML for a do statement.

  • void JackAnalyzer::analyzeLet();: Recursively outputs the XML for a let statement.

  • void JackAnalyzer::analyzeWhile();: Recursively outputs the XML for a while statement.

  • void JackAnalyzer::analyzeReturn();: Recursively outputs the XML for a return statement.

  • void JackAnalyzer::analyzeIf();: Recursively outputs the XML for an if statement.

  • void JackAnalyzer::analyzeExpression();: Recursively outputs the XML for an expression.

  • void JackAnalyzer::analyzeTerm();: Recursively outputs the XML for a term.

  • void JackAnalyzer::analyzeExpressionList();: Recursively outputs the XML for an expression list.

$ ./jack_analyze jack/Hello/Main.jack

  class
  Main
  {
  
  
  
    void
    main
    (
    
    
    )
    {
    
...
Full sample output for the analysis of Main.jack is here (Main-analyze.xml)

Implementation advice:

  • The analyzer functions should implement the grammar described in your book, The Elements of Computing Systems.

  • The indentation level is a member variable in JackAnalyzer. Increase the indent before recursively printing and then increase it again on the other side.

  • The token list and current token index are member variables. Processing a token should increase the current token.

  • printf supports printing multiple spaces using the format "%*s".

For example, the code below shows how the class can be implemented

void JackAnalyzer::compileClass()
{
  fprintf(mOutputFile, "%*s", mIndent, "<class>\n");
  mIndent += 2;

  process(mTokens[mCurrentToken++]); // class
  process(mTokens[mCurrentToken++]); // name
  process(mTokens[mCurrentToken++]); // {
  analyzeClassVarDec();
  analyzeSubroutineDec();
  process(mTokens[mCurrentToken++]); // }

  mIndent -= 2;
  fprintf(mOutputFile, "%*s", mIndent, "</class>\n");
}

You are given a script to help you test.

$ ./test_jack_analyzer.sh
==== Analyze TestAdd.jack ====
Files jack/Tests/TestAdd-analyze.xml and jack/Tests/Answers/TestAdd-analyze.xml are identical
... etc

3. Final Project

Due December 14th (Sunday) by midnight

3.1. VM Writer

In the file, common/vmwriter.cpp, implement utilities for writing VM code to a file. You should implement the following functions.

  • void writePush(VM_Segment segment, int num);: Write a push segment num command to the file indicated by the member variable mOutputFile

  • void writePop(VM_Segment segment, int num);: Write a pop segment num command

  • void writeArithmetic(VM_Op op);: Write an operation based on the op (e.g. VM_ADD would output add)

  • void writeLabel(const char* label);: Write a label command

  • void writeGoto(const char* label);: Write a goto command to the given label

  • void writeIf(const char* label);: Write and if-goto command with the given label

  • void writeCall(const char* fnname, int nArgs);: Write a function call

  • void writeFunction(const char* fnname, int nLocals);: Write a function defition

  • void writeReturn();: Write a return statement

Although this is a simple class, you may want to put asserts into your functions to help you track down errors. Use the utility test_vmwriter to check your writer.

$ ./test_vmwriter
push that 0
push this 1
push static 4
push constant 111
pop pointer 0
pop local 2
pop temp 3
pop static 4
add
not
label L1
if-goto L1
call Exp 3
function Exp 1
return

3.2. Compiler

In the file, common/jackcompiler.cpp, implement a compiler for the Jack language that outputs VM code. The API is very similar to common/jackanalyzer.cpp and I recommend you use the analyzer as a starting point. The parser differs from the analyzer as following:

  • The process function now verifies that the current token has the correct type and value.

  • The parser contains two symbol tables, one for classes and one for subroutines. These tables will be modified in compileClassVarDec and compileVarDec respectively.

  • The parser contains a VMWriter object for writing a VM file.

Below is the list of functions to implement.

  • void JackCompiler::process(const hack::JackToken& token, TokenType expectedType, const char* expectedValue);: Checks that the current token has the expected type and value. If not, prints the current line and an error message.

  • void JackCompiler::compileClass();: Recursively compiles the VM code for a class.

  • void JackCompiler::compileClassVarDec();: Recursively compiles the VM code for class variable declarations.

  • void JackCompiler::compileSubroutineDec();: Recursively compiles the VM code for a subroutine declaration.

  • void JackCompiler::compileSubroutineBody();: Recursively compiles the VM code for a subroutine body.

  • void JackCompiler::compileParameterList();: Recursively compiles the VM code for a parameter list.

  • void JackCompiler::compileVarDec();: Recursively compiles the VM code for a variable declaration in a function or method.

  • void JackCompiler::compileStatements();: Recursively compiles the VM code for a list of statements.

  • void JackCompiler::compileDo();: Recursively compiles the VM code for a do statement.

  • void JackCompiler::compileLet();: Recursively compiles the VM code for a let statement.

  • void JackCompiler::compileWhile();: Recursively compiles the VM code for a while statement.

  • void JackCompiler::compileReturn();: Recursively compiles the VM code for a return statement.

  • void JackCompiler::compileIf();: Recursively compiles the VM code for an if statement.

  • void JackCompiler::compileExpression();: Recursively compiles the VM code for an expression.

  • void JackCompiler::compileTerm();: Recursively compiles the VM code for a term.

  • void JackCompiler::compileExpressionList();: Recursively compiles the VM code for an expression list.

$ ./jack_compiler jack/Hello/Main.jack
Outputting file: jack/Hello/Main.vm

You are given a script to help you test.

$ ./test_jack_compiler.sh
==== Compile TestAdd.jack ====
Files jack/Tests/TestAdd.vm and jack/Tests/Answers/TestAdd.vm are identical
... etc
You may have a slightly different output that still produces correct behavior. However, for this assignment, match the generated output to make testing easier.

3.2.1. Test

Develop your compiler in steps. Below is a recommended order.

1) Add code to fill the class and subroutine symbol tables in compilerClassVarDec and compileVarDec respectively. You can use test_symtable to test your work.

$ ./test_symtable
Writing file jack/Tests/TestSymTable.vm
Symbol global0
  Expected int (Got int): PASSED
  Expected kind 4 (Got 4): PASSED
  Expected number 0 (Got 0): PASSED
Symbol global1
  Expected boolean (Got boolean): PASSED
  Expected kind 4 (Got 4): PASSED
  Expected number 1 (Got 1): PASSED
... etc

2) Extend compileReturn, compileSubroutineDec, and compileVarDec to support function definitions. Check your work with jack/Tests/TestLet.vm. You should have something like the following.

$ ./jack_compiler jack/Tests/TestLet.jack
$ cat jack/Tests/TestLet.vm
function TestLet.main 1
push constant 0
return

3) Extend compileLet and compileTerm to support assignments for non-array types. Test again with TestLet.jack.

$ ./jack_compiler jack/Tests/TestLet.jack
$ cat jack/Tests/TestLet.vm
function TestLet.main 1
push constant 3
pop local 0
push constant 0
return

4) Implement the remaining terms in compileTerm with the exception of function calls and array types. You should now be able to compile TestBool.jack, TestNull.jack, TestString.jack

5) Implement compileExpressions. You should not be able to compile TestAdd.jack, TestSub.jack, and TestMultiply.jack.

6) Implement compileIf. You should be able to compile TestIf.jack

7) Implement compileWhile. You should be able to compile TestWhile.jack

8) Implement function calls. You should be able to compile TestFunction.jack

9) Implement constructor and method calls. You should be able to compile TestClass.jack

10) Complete compileTerm and compileLet to support arrays. You should be able to compile TestArray.jack

3.3. Resubmissions

You may also submit bug fixes for the following assignments for partial credit:

3.4. (Extra Credit) Simulated Computer

Compile your demo to machine code using your implemented compiler and run it on your simulated computer from assignment A03. For full credit, demo

Submit your Work

Push you work to Github to submit your work.

$ cd Compiler
$ git add .
$ git commit -m "Compiler complete"
$ git push