bracket is a compiler for a small, statically-typed, Racket-like language. It translates source code into LLVM Intermediate Representation (IR), which can then be compiled into a native executable.
This project is an implementation of the LFun language, as described in Jeremy Siek's book, "Essentials of Compilation: An Incremental Approach" (EoC). For a more in-depth understanding of the language's design and semantics please refer to the relevant sections of the book and the corresponding code (there may be minor differences).
Warning
The implementation is not an exact replica of behavior mentioned in EoC. For instance, there is no short circuiting of logical operators, nor is there garbage collection for tuples. Refer to the examples in tests and the source code to fully understand the behavior.
Note
This is part of the course project for the Compilers course at IIITH as part of a team with @GnanaPrakashSG2004. This README was partially generated with an LLM.
- Syntax
- Language Features
- Example Program
- Prerequisites
- Installation and Building
- Usage
- Running Tests
- Project Structure
See LFun in EoC Section 7.1
type ::= Integer
exp ::= int | (read) | (- exp) | (+ exp exp) | (- exp exp)
exp ::= var | (let ([var exp]) exp)
type ::= Boolean
bool ::= #t | #f
cmp ::= eq? | < | <= | > | >=
exp ::= bool | (and exp exp) | (or exp exp) | (not exp) | (cmp exp exp) | (if exp exp exp)
type ::= Void
exp ::= (set! var exp) | (begin exp* exp) | (while exp exp) | (void)
type ::= (Vector type*)
exp ::= (vector exp*) | (vector-length exp) | (vector-ref exp int) | (vector-set! exp int exp)
type ::= (type... -> type)
exp ::= (exp exp...)
def ::= (define (var [var:type]...) : type exp)
LFun ::= def... exp
The language enforces static type checking. All variable bindings, function parameters, and return values must have their types explicitly declared or inferrable at compile time. This helps catch errors at compile time before the program is run. Comments are not supported. A statement within parentheses evaluates to and returns a value (except for: the set!, void, define and while expressions which do not return anything, and the begin expression, which returns the value of the last corresponding exp). The last expression value is printed to stdout.
The program
(+ 1 2)
prints 3.
Local variables are introduced using let expressions and are visible within the corresponding let block. A let binding shadows any existing variables with the same name within its scope.
The program
(let ([x 10])
(let ([x 20])
x))
evaluates to 20.
The read expression can be used to get a single int/bool user input from stdin. The program
(let ([x (read)]) (+ x 10))
with an input of 32 evaluates to and prints 42.
Boolean values for true and false are written as #t and #f, respectively.
- Conditional Logic: The
(if e1 e2 e3)expression evaluatese1. If the result is#t, it evaluatese2; otherwise, it evaluatese3. - Logical Operators: The language provides
and,or, andnot, which behave according to standard propositional logic. - Comparison Operators: Integers can be compared using
eq?,<,<=,>, and>=. Theeq?operator is also used for checking pointer equality on vectors.
The program
(if (and #t #f) 1 0)
prints 0.
The program
(let ([x 5])
(let ([result (if (> x 3)
(let ([x 10]) (+ x 5))
(let ([x 2]) (- x 1)))])
result))
prints 15.
The programs
(if 5 1 0)
and
(let ([x 5])
(if x 1 0))
throw an error because the condition being evaluated (e1) must return a boolean, and both branches (e2 and e3 must be of the same type).
The language includes several forms for managing state and control flow.
-
The
VoidType: TheVoidtype and the literal(void)expression represent the absence of a useful value. They are used for expressions that are evaluated purely for their side effects, such asset!andwhile. -
set!: The(set! var exp)expression mutates (changes) the value of a previously defined variable. Theset!expression itself evaluates tovoid. The type checker requires that the type ofexpmust match the existing type ofvar. -
begin: The(begin exp1 exp2 ...)expression allows for sequencing. It evaluates each of its sub-expressions in order, from left to right. The result of the entirebeginblock is the result of its final sub-expression. -
while: The(while cond-exp body-exp)expression provides a looping construct. It repeatedly evaluatescond-exp. If the result is#t, it executesbody-expand then repeats the process. The loop terminates whencond-expevaluates to#f. The result of awhileloop is alwaysvoid, and the type checker requires thatcond-expmust be of typeBoolean.
This example uses all four features to loop five times and return the final value of the counter as 5.
(let ([i 0])
(begin
(while (< i 5)
(set! i (+ i 1)))
i))
The language supports vectors, which are used as fixed-size, heterogeneous tuples.
(vector exp*): Creates a new vector containing the evaluated results of the expressions. Does not return a value.(vector-ref vec-exp index): Retrieves the element at the given integerindex.(vector-set! vec-exp index val-exp): Modifies the element atindexto a new value. Does not return a value.(vector-length vec-exp): Returns the number of elements in the vector.
Aliasing and Shallow Copies: When a vector is bound to a variable or passed to a function, a shallow copy is performed. This means multiple variables can point to the exact same vector in memory. The eq? operator checks for this aliasing (pointer equality), not for deep structural equality.
In this example, t1 and t2 are aliases for the same vector, while t3 is a different vector with identical content. The program correctly identifies this and returns 42.
(let ([t1 (vector 3 7)])
(let ([t2 t1])
(let ([t3 (vector 3 7)])
(if (and (eq? t1 t2) (not (eq? t1 t3)))
42
0))))
- Global Scope: All
defined functions exist in a global scope and are visible throughout the entire program. The order of function definitions does not matter. - First-Class Citizens: Functions can be passed as arguments, returned from other functions, and stored in data structures. The type syntax for a function is
(type1 ... -> typeR). - No Lexical Scoping: A key limitation compared to Racket is that functions are not lexically scoped. The only external variables a function can access are other globally defined functions. Function definitions cannot be nested.
(define (sum [n : Integer]) : Integer
(if (eq? n 0)
0
(+ n (sum (- n 1)))))
(sum 5)
prints 15.
(define (neg [b : Boolean]) : Boolean
(not b))
(define (get-neg [x : Integer]) : (Boolean -> Boolean)
neg)
(let ([f (get-neg 1)]) (f #f))
prints #t.
(define (add-one [x : Integer]) : Integer
(+ x 1))
(define (compose [f : (Integer -> Integer)] [g : (Integer -> Integer)] [x : Integer]) : Integer
(f (g x)))
(let ([y 10])
(compose add-one add-one y))
prints 12.
The following program defines a map function that applies another function inc to each element of a vector. This demonstrates the use of first-class functions. The final expression returns 42 which is printed to stdout.
(define (map [f : (Integer -> Integer)] [v : (Vector Integer Integer)])
: (Vector Integer Integer)
(vector (f (vector-ref v 0)) (f (vector-ref v 1))))
(define (inc [x : Integer]) : Integer
(+ x 1))
(vector-ref (map inc (vector 0 41)) 1)
Before building bracket, you need:
- LLVM and Clang (version 19 or newer is recommended)
- CMake
- Ninja build system
- Python (for running the test suite)
Warning
The main branch requires rtti (run time type information for dynamic casting of objects). If you want to compile without rtti, switch to the stage4 branch.
The easiest way to install the dependencies on macOS is with Homebrew. See the LLVM Github and the LLVM Getting Started page on installing LLVM.
Homebrew (and possibly your distro's package manager) installs LLVM and clang in a non-standard path to avoid conflicts with the system's tools. You need to export some environment variables so that CMake can find the correct LLVM installation. The following are the ones I used, ymmv.
export LDFLAGS="-L/opt/homebrew/opt/llvm/lib/clang/19 -L/opt/homebrew/opt/llvm/lib/c++ -L/opt/homebrew/opt/llvm/lib/unwind -lunwind"
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"Use the included make_build.sh script, replacing these exports or build it yourself by creating a build directory and running CMake from the project root.
mkdir build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang \
-DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++ \
-DLLVM_BIN_DIR=/opt/homebrew/opt/llvm/bin \
-S . -B buildNote
NOTE: On Linux, add -DLLVM_DIR=<your_llvm_dir> to the cmake command instead of the -DLLVM_BIN_DIR line.
Once the project is configured, run the build command from the project root:
cmake --build ./buildThe llracket binary will be located in ./build/bin/.
To compile a bracket source file (e.g., program.rkt) into LLVM IR, run the compiler and specify an output file.
./build/bin/llracket program.rkt -o program.llIf you omit the -o flag, the output will be printed to standard output.
The generated LLVM IR depends on a small C runtime for I/O operations like read. To create a standalone executable, you must compile the .ll file and link it with runtime.c.
-
Compile LLVM IR to an object file:
llc -filetype=obj program.ll -o program.o
-
Link the object file with the runtime to create an executable:
clang program.o ./tools/runtime/runtime.c -o my_program
-
Run your compiled program:
./my_program
The project includes a test suite that you can run using ninja.
-
Make sure you have built the project as described above.
-
Navigate into the
builddirectory and runninja check.cd build ninja check
You can also use the provided helper script from the project root, which rebuilds and runs the tests:
bash ./run_tests.shThe codebase is organized as:
include/: header files for the compiler librarieslib/: The source code for the compiler's phases (Lexer, Parser, Sema, CodeGen) and some common utils (Basic).tools/:driver/: The source for thellracketcommand-line executable.runtime/: The C runtime library required by compiled programs.
tests/: Test files and scripts for verifying the compiler's correctness. To add a new test, add a<name>.rktfile with the program, with a<name>.rkt.infor stdin input (optional),<name>.rkt.outfor expected output or<name>.rkt.errif an error is expected.make_build.sh: Creates a build directory and builds the compiler executable.run_tests.sh: Runsmake_build.shand runs all tests.