UofT ECE 467 (2022 Fall) - Continuing Your Compiler After Lab 4/The Course (for your own interest)



The website and starter code are being pre-emptively updated!

I don't know if I'll be teaching the course again, but if I wait until next year I'll forget feedback I've received. Note that more test cases have been added (if you pull the new starter code); those are for the (potential) future!


Improving the Lexer

If you wish to continue building your compiler, I recommend git pulling the latest changes I’ve made to the starter code. I’ve added a (more) proper way to deal with type vs. identifier tokens in the lexer/parser. Basically, there is a shared dictionary between the lexer and parser, which indicates to the lexer which identifiers are actually types. For example, the dictionary is initially populated with primitive types (int, float, void).
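As a rough sketch of the idea (not the starter code's actual implementation), the shared dictionary can be as simple as a set of names that the parser writes to and the lexer reads from. The names TypeNameTable, TokenKind, add_type and classify below are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical token kinds; the starter code's real names may differ.
enum class TokenKind { Identifier, TypeName };

// Hypothetical shared table: the parser adds names, the lexer queries them.
class TypeNameTable {
public:
    // Start with the primitive types already known to be types.
    TypeNameTable() : names_{"int", "float", "void"} {}

    // Called by the parser once it has recognized a typedef declaration.
    void add_type(const std::string& name) {
        assert(!name.empty());
        names_.insert(name);
    }

    // Called by the lexer after matching an identifier-shaped lexeme.
    TokenKind classify(const std::string& lexeme) const {
        return names_.count(lexeme) ? TokenKind::TypeName
                                    : TokenKind::Identifier;
    }

private:
    std::unordered_set<std::string> names_;
};
```

The important property is the ordering: the parser must register a new type name before the lexer is asked for the next token that uses it.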

You can first try to support typedef statements in global scope. Next, consider what data structure you would replace the shared dictionary with to support typedefs in block scopes. For example, in the following code the declaration typedef int foo; applies only inside the function.

int main() {
	typedef int foo;
	foo f = 3;
	return f;
}
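One possible answer, sketched under the assumption that scopes are entered and exited in a strictly nested order, is a stack of sets: push a set when a block opens, pop it when the block closes, and search from the innermost scope outwards. The class and method names here (ScopedTypeNames, enter_scope, exit_scope, add_type, is_type) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical scoped replacement for a flat set of type names.
class ScopedTypeNames {
public:
    // The outermost (global) scope holds the primitive types.
    ScopedTypeNames() {
        scopes_.push_back(
            std::unordered_set<std::string>{"int", "float", "void"});
    }

    // Call when the parser sees the start of a block.
    void enter_scope() { scopes_.emplace_back(); }

    // Call when the block ends; the global scope is never popped.
    void exit_scope() {
        assert(scopes_.size() > 1);
        scopes_.pop_back();
    }

    // Record a typedef name in the innermost scope.
    void add_type(const std::string& name) { scopes_.back().insert(name); }

    // Search from the innermost scope outwards.
    bool is_type(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            if (it->count(name)) return true;
        }
        return false;
    }

private:
    std::vector<std::unordered_set<std::string>> scopes_;
};
```

This sketch ignores shadowing, for example a local variable named foo declared inside a block where foo is also a typedef from an outer scope. Handling that would require recording ordinary identifiers per scope as well.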

Straightforward Additions

The following things should be straightforward to implement. If they feel tedious, take some time to think about the structure of your code; it could probably be refactored to make things easier (if you’re unsure, feel free to reach out to me).

Struct Definitions

This requires quite a bit of extra bookkeeping. Here are some useful references:

Pointer Types

In general, care needs to be taken during lexing/parsing to handle the following ambiguity.

a * b; // pointer variable declaration or multiplication

In our case, we already have a dictionary that will tell us whether a is a type or not (if a is not a type, treat it as an identifier).

The parser needs to be augmented to recognize types that are not just “strings”. You probably want to create something similar to your AST node definitions, but for types.
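To make "AST-like nodes for types" concrete, here is a minimal sketch that covers only primitive and pointer types. Everything in it (Type, TypePtr, make_primitive, make_pointer, same_type, type_to_string) is a hypothetical name; your own design may look quite different, especially once structs and arrays are added.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type node: either a named primitive or a pointer to a type.
struct Type {
    enum class Kind { Primitive, Pointer };
    Kind kind;
    std::string name;                     // used when kind == Primitive
    std::shared_ptr<const Type> pointee;  // used when kind == Pointer
};

using TypePtr = std::shared_ptr<const Type>;

TypePtr make_primitive(const std::string& name) {
    return std::make_shared<Type>(Type{Type::Kind::Primitive, name, nullptr});
}

TypePtr make_pointer(TypePtr pointee) {
    assert(pointee != nullptr);
    return std::make_shared<Type>(
        Type{Type::Kind::Pointer, "", std::move(pointee)});
}

// Structural equality: int* equals int*, but not int**.
bool same_type(const Type& a, const Type& b) {
    if (a.kind != b.kind) return false;
    if (a.kind == Type::Kind::Primitive) return a.name == b.name;
    return same_type(*a.pointee, *b.pointee);
}

// Render a type the way it would be written in source, e.g. "int**".
std::string type_to_string(const Type& t) {
    if (t.kind == Type::Kind::Primitive) return t.name;
    return type_to_string(*t.pointee) + "*";
}
```

With a structure like this, type checking compares nodes with same_type rather than comparing strings, and a pointer cast becomes a check that both operand types have kind Pointer.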

Additionally, you will want to add support for casting between different pointer types.

Arrays

With the experience gained from structs, and with support for pointers in place, you should be able to add support for arrays. Do not try to allocate arrays on the stack; define a malloc (or similar) function in runtime.cpp that returns a heap-allocated pointer (and a corresponding free function). You may wish to use char* as the return type, as opposed to supporting void* in your type system.
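A minimal sketch of what such runtime functions might look like is below. The names rt_alloc and rt_free, the int size parameter, and the use of extern "C" (to avoid C++ name mangling when your generated code links against the runtime) are all assumptions; match whatever conventions the existing functions in runtime.cpp already follow.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical runtime allocator: returns char* so the compiled language
// does not need a void* type. Returns a null pointer if allocation fails.
extern "C" char* rt_alloc(int num_bytes) {
    assert(num_bytes >= 0);
    return static_cast<char*>(
        std::malloc(static_cast<std::size_t>(num_bytes)));
}

// Hypothetical counterpart that releases memory obtained from rt_alloc.
extern "C" void rt_free(char* ptr) {
    std::free(ptr);
}
```

An array of n ints would then be allocated as n times the size of an int in bytes, with the resulting char* cast to int* using your pointer cast support.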

Global Variables

Hmmm…

String Literals

These would be stored in global data.

And more…

Very impressive if you get here!


Last updated: 2022-12-23 09:56:36 -0500.