SpecterCC Architecture

Overview

SpecterCC is a working compiler, written in C++17, that translates .spc source files into x86-64 assembly through a classic multi-stage compilation pipeline. Each stage (lexer, parser, AST, semantic analyzer, code generator) lives in its own module, and a driver wires the stages together.

Compilation Pipeline

Source Code (.spc)
    ↓
[ Lexical Analysis ]
    ↓
Token Stream
    ↓
[ Syntax Analysis ]
    ↓
Abstract Syntax Tree (AST)
    ↓
[ Semantic Analysis ]
    ↓
Annotated AST
    ↓
[ Code Generation ]
    ↓
x86-64 Assembly (.s)
    ↓
[ Assembler & Linker ]
    ↓
Executable Binary

Components

1. Lexer (Lexical Analyzer)

Location: include/lexer/, src/lexer/

Responsibilities:

  • Tokenize source code into a stream of tokens
  • Handle keywords, identifiers, literals, operators, and delimiters
  • Track source locations for error reporting
  • Skip whitespace and comments

Key Classes:

  • Lexer: Main lexer class that processes source code
  • Token: Represents a single token with type, lexeme, and location
  • TokenType: Enumeration of all token types
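
As a rough sketch of how these three pieces might fit together (the actual SpecterCC declarations are not reproduced here, so the enumerators, field names, and the free function `tokenize` below are assumptions for illustration only):

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the real TokenType enumeration is likely larger
// (for example, it would distinguish keywords from identifiers).
enum class TokenType { Identifier, Number, Operator, Delimiter, EndOfFile };

// A token records its type, the matched text (lexeme), and where it started.
struct Token {
    TokenType type;
    std::string lexeme;
    int line;
    int column;
};

// Illustrative stand-in for Lexer: splits the input into tokens while
// skipping whitespace and "//" comments and tracking 1-based line/column.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    int line = 1, column = 1;
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };

    while (i < src.size()) {
        const unsigned char c = at(i);
        if (c == '\n') { ++line; column = 1; ++i; continue; }
        if (std::isspace(c)) { ++column; ++i; continue; }
        if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            while (i < src.size() && src[i] != '\n') { ++i; ++column; }
            continue;
        }

        const std::size_t start = i;
        TokenType type;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
            type = TokenType::Identifier;
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(at(i))) ++i;
            type = TokenType::Number;
        } else {
            ++i;
            // Two-character operators: == != <= >= && ||
            if (i < src.size()) {
                const char n = src[i];
                const bool pair =
                    (n == '=' && (c == '=' || c == '!' || c == '<' || c == '>')) ||
                    (c == '&' && n == '&') || (c == '|' && n == '|');
                if (pair) ++i;
            }
            const std::string delimiters = "(){};,";
            type = delimiters.find(static_cast<char>(c)) != std::string::npos
                       ? TokenType::Delimiter
                       : TokenType::Operator;
        }
        tokens.push_back({type, src.substr(start, i - start), line, column});
        column += static_cast<int>(i - start);
    }
    tokens.push_back({TokenType::EndOfFile, "", line, column});
    return tokens;
}
```

Calling `tokenize("int x = 42;")` on this sketch yields five tokens plus an end-of-file marker, each tagged with the line and column where it starts. Keywords such as `int` come back as identifiers here; a fuller lexer would look them up in a keyword table.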

2. Parser (Syntax Analyzer)

Location: include/parser/, src/parser/

Responsibilities:

  • Parse token stream into an Abstract Syntax Tree (AST)
  • Implement recursive descent parsing with operator precedence
  • Handle grammar rules and syntax errors
  • Error recovery and synchronization

Key Classes:

  • Parser: Implements recursive descent parser
  • Parser methods implement the grammar rules for expressions, statements, and declarations

Operator Precedence (highest to lowest):

  1. Primary (literals, identifiers, parentheses)
  2. Unary (-, !)
  3. Multiplicative (*, /, %)
  4. Additive (+, -)
  5. Relational (<, <=, >, >=)
  6. Equality (==, !=)
  7. Logical AND (&&)
  8. Logical OR (||)
  9. Assignment (=)
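
In a recursive descent parser, a precedence table like this usually becomes one function per level, with each function calling the next-tighter level for its operands. The sketch below illustrates that structure for levels 1 through 8. It is not the SpecterCC parser: the class name `ExprEvaluator` is hypothetical, it works on integer literals only, it evaluates directly instead of building AST nodes, and it omits assignment because that needs variables.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Illustrative only: one method per precedence level, lowest level on top.
class ExprEvaluator {
public:
    explicit ExprEvaluator(std::string text) : src_(std::move(text)) {}

    long evaluate() {
        const long value = logicalOr();
        skipSpaces();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    // Consume `op` if it is the next thing in the input.
    bool match(const std::string& op) {
        skipSpaces();
        if (src_.compare(pos_, op.size(), op) != 0) return false;
        pos_ += op.size();
        return true;
    }
    static long nonZero(long v) {
        if (v == 0) throw std::runtime_error("division by zero");
        return v;
    }

    long primary() {                       // level 1: literals, parentheses
        if (match("(")) {
            const long value = logicalOr();
            if (!match(")")) throw std::runtime_error("expected ')'");
            return value;
        }
        if (pos_ >= src_.size() || !std::isdigit(static_cast<unsigned char>(src_[pos_])))
            throw std::runtime_error("expected expression");
        long value = 0;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_])))
            value = value * 10 + (src_[pos_++] - '0');
        return value;
    }
    long unary() {                         // level 2: -, !
        if (match("-")) return -unary();
        if (match("!")) return unary() == 0 ? 1 : 0;
        return primary();
    }
    long multiplicative() {                // level 3: *, /, %
        long lhs = unary();
        while (true) {
            if (match("*")) lhs *= unary();
            else if (match("/")) lhs /= nonZero(unary());
            else if (match("%")) lhs %= nonZero(unary());
            else return lhs;
        }
    }
    long additive() {                      // level 4: +, -
        long lhs = multiplicative();
        while (true) {
            if (match("+")) lhs += multiplicative();
            else if (match("-")) lhs -= multiplicative();
            else return lhs;
        }
    }
    long relational() {                    // level 5: <, <=, >, >=
        long lhs = additive();
        while (true) {
            // Try two-character operators before their one-character prefixes.
            if (match("<=")) lhs = lhs <= additive();
            else if (match(">=")) lhs = lhs >= additive();
            else if (match("<")) lhs = lhs < additive();
            else if (match(">")) lhs = lhs > additive();
            else return lhs;
        }
    }
    long equality() {                      // level 6: ==, !=
        long lhs = relational();
        while (true) {
            if (match("==")) lhs = lhs == relational();
            else if (match("!=")) lhs = lhs != relational();
            else return lhs;
        }
    }
    long logicalAnd() {                    // level 7: &&
        long lhs = equality();
        while (match("&&")) { const long rhs = equality(); lhs = (lhs != 0 && rhs != 0) ? 1 : 0; }
        return lhs;
    }
    long logicalOr() {                     // level 8: ||
        long lhs = logicalAnd();
        while (match("||")) { const long rhs = logicalAnd(); lhs = (lhs != 0 || rhs != 0) ? 1 : 0; }
        return lhs;
    }
};
```

For example, `ExprEvaluator("1 + 2 * 3").evaluate()` returns 7 because `additive` hands the right operand to `multiplicative`, which consumes `2 * 3` before the addition happens. Unlike generated code, this evaluator does not short-circuit `&&` and `||`, since it must still consume the right-hand operand from the input.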

3. AST (Abstract Syntax Tree)

Location: include/ast/, src/ast/

Responsibilities:

  • Represent program structure as a tree
  • Provide base classes for all AST nodes
  • Support type information attachment

Key Classes:

  • ASTNode: Base class for all nodes
  • Expression: Base for expressions (literals, binary ops, calls, etc.)
  • Statement: Base for statements (if, while, return, etc.)
  • FunctionDecl: Function declarations with parameters and body
  • Program: Root node containing all functions
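
One plausible shape for this hierarchy is sketched below. Only the names `ASTNode`, `Expression`, `Statement`, `FunctionDecl`, and `Program` come from the list above; the concrete node types (`IntLiteral`, `BinaryExpr`, `ReturnStmt`), the `Parameter` struct, all field names, and the helper `buildExampleProgram` are assumptions made for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for all nodes; carries a source location for diagnostics.
struct ASTNode {
    int line = 0;
    int column = 0;
    virtual ~ASTNode() = default;
};

struct Expression : ASTNode {
    std::string resolvedType;  // assumed slot: empty until semantic analysis fills it in
};

struct Statement : ASTNode {};

struct IntLiteral : Expression {
    int value;
    explicit IntLiteral(int v) : value(v) {}
};

struct BinaryExpr : Expression {
    std::string op;
    std::unique_ptr<Expression> lhs, rhs;
    BinaryExpr(std::string o, std::unique_ptr<Expression> l, std::unique_ptr<Expression> r)
        : op(std::move(o)), lhs(std::move(l)), rhs(std::move(r)) {}
};

struct ReturnStmt : Statement {
    std::unique_ptr<Expression> value;
    explicit ReturnStmt(std::unique_ptr<Expression> v) : value(std::move(v)) {}
};

struct Parameter {
    std::string type;
    std::string name;
};

struct FunctionDecl : ASTNode {
    std::string returnType;
    std::string name;
    std::vector<Parameter> params;
    std::vector<std::unique_ptr<Statement>> body;
};

// Root node: owns every function in the translation unit.
struct Program : ASTNode {
    std::vector<std::unique_ptr<FunctionDecl>> functions;
};

// Hand-builds the tree a parser might produce for: int main() { return 1 + 2; }
std::unique_ptr<Program> buildExampleProgram() {
    auto sum = std::make_unique<BinaryExpr>("+", std::make_unique<IntLiteral>(1),
                                            std::make_unique<IntLiteral>(2));
    auto fn = std::make_unique<FunctionDecl>();
    fn->returnType = "int";
    fn->name = "main";
    fn->body.push_back(std::make_unique<ReturnStmt>(std::move(sum)));

    auto program = std::make_unique<Program>();
    program->functions.push_back(std::move(fn));
    return program;
}
```

Owning children through `std::unique_ptr` keeps the tree's lifetime tied to the `Program` root, and the `resolvedType` field shows one way type information could be attached after parsing.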

4. Semantic Analyzer

Location: include/semantic/, src/semantic/

Responsibilities:

  • Type checking and inference
  • Symbol table management with scoping
  • Undeclared identifier detection
  • Function signature verification

Key Classes:

  • SemanticAnalyzer: Main analyzer with scope management
  • Symbol: Represents variables and functions in symbol table

Features:

  • Nested scope support
  • Type checking for expressions and assignments
  • Function parameter count validation
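
Nested scopes and parameter-count checks are commonly built on a stack of hash maps. The following is a minimal sketch of that idea, not the SpecterCC implementation: `ScopeStack`, `callMatches`, and the fields of `Symbol` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Assumed symbol record: covers both variables and functions.
struct Symbol {
    std::string name;
    std::string type;            // variable type, or return type for a function
    bool isFunction = false;
    std::size_t paramCount = 0;  // only meaningful when isFunction is true
};

class ScopeStack {
public:
    ScopeStack() { enterScope(); }  // the first scope is the global one

    void enterScope() { scopes_.emplace_back(); }

    // The global scope is never popped.
    void exitScope() {
        if (scopes_.size() > 1) scopes_.pop_back();
    }

    // Returns false if the name is already declared in the innermost scope.
    bool declare(const Symbol& symbol) {
        return scopes_.back().emplace(symbol.name, symbol).second;
    }

    // Searches from the innermost scope outward, so inner names shadow outer ones.
    std::optional<Symbol> lookup(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            const auto found = scope->find(name);
            if (found != scope->end()) return found->second;
        }
        return std::nullopt;  // undeclared identifier
    }

private:
    std::vector<std::unordered_map<std::string, Symbol>> scopes_;
};

// Parameter-count validation for a call site.
bool callMatches(const ScopeStack& scopes, const std::string& callee, std::size_t argCount) {
    const auto symbol = scopes.lookup(callee);
    return symbol && symbol->isFunction && symbol->paramCount == argCount;
}
```

An analyzer built this way would call `enterScope()` when it enters a function body or block and `exitScope()` when it leaves, and would report an undeclared identifier whenever `lookup` returns `std::nullopt`.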

5. Code Generator

Location: include/codegen/, src/codegen/

Responsibilities:

  • Generate x86-64 assembly code
  • Manage stack frames and calling conventions
  • Implement control flow
  • Handle variable storage and parameter passing

Key Classes:

  • CodeGenerator: Generates assembly from AST

Code Generation Details:

  • Calling Convention: Stack-based parameter passing
  • Register Usage:
    • rax: Return values, expression results
    • rcx: Temporary for binary operations
    • rbp: Frame pointer
    • rsp: Stack pointer
    • rdi: Exit code for syscall
  • Stack Layout:
    rbp+N:  Parameters (N = 16, 24, 32, ...)
    rbp+8:  Return address
    rbp+0:  Saved frame pointer
    rbp-8:  First local variable
    rbp-16: Second local variable
    ...
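
The offsets in this layout follow a regular pattern of 8-byte slots on either side of the saved frame pointer. The helpers below sketch that arithmetic. The function names are hypothetical, the Intel-style `[rbp+N]` operand text is an assumption (the syntax SpecterCC emits is not stated here), and the sketch deliberately says nothing about which parameter lands in which slot.

```cpp
#include <string>

constexpr int kSlotSize = 8;  // each parameter or local occupies one 8-byte slot

// Offset of the i-th parameter slot (0-based): rbp+16, rbp+24, rbp+32, ...
// The first 16 bytes above rbp hold the saved frame pointer and return address.
constexpr int parameterSlotOffset(int i) { return 16 + kSlotSize * i; }

// Offset of the i-th local variable (0-based): rbp-8, rbp-16, ...
constexpr int localSlotOffset(int i) { return -kSlotSize * (i + 1); }

// Formats an rbp-relative memory operand (Intel-style syntax is an assumption).
std::string frameOperand(int offset) {
    if (offset == 0) return "[rbp]";
    const std::string sign = offset > 0 ? "+" : "-";
    const int magnitude = offset > 0 ? offset : -offset;
    return "[rbp" + sign + std::to_string(magnitude) + "]";
}
```

With these helpers, the second local variable resolves to `frameOperand(localSlotOffset(1))`, which produces `[rbp-16]`, matching the layout above.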
    

6. Driver

Location: src/main.cpp

Responsibilities:

  • Command-line argument parsing
  • File I/O
  • Pipeline orchestration
  • Error reporting

Build System

Tool: CMake 3.15+

Structure:

  • C++17 standard
  • Warning flags enabled for GCC/Clang
  • Modular compilation units
  • Single executable target: spectercc

Supported Language Features

Types

  • int: 32-bit signed integer
  • float: Single precision floating point
  • double: Double precision floating point
  • char: Character type
  • bool: Boolean type
  • void: Void type (for functions)

Operators

Arithmetic: +, -, *, /, %

Comparison: ==, !=, <, <=, >, >=

Logical: &&, ||, !

Assignment: =

Control Flow

If Statement:

if (condition) {
    statements
} else {
    statements
}

While Loop:

while (condition) {
    statements
}

Functions

Declaration:

return_type function_name(type param1, type param2) {
    statements
}

Features:

  • Parameters passed by value
  • Return statement required for non-void functions
  • Function calls with arguments

Variables

Declaration:

type variable_name;
type variable_name = initializer;

Assignment:

variable = expression;

Error Handling

The compiler provides detailed error messages with:

  • Source file name
  • Line number
  • Column number
  • Descriptive error message

Errors are reported at multiple stages:

  1. Lexer: Invalid tokens, unterminated strings
  2. Parser: Syntax errors, unexpected tokens
  3. Semantic: Type mismatches, undeclared identifiers
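
A diagnostic carrying these four pieces of information could be assembled as sketched below. The exact message format SpecterCC prints is not documented here, so the `file:line:column:` layout (borrowed from the GCC/Clang convention), the `SourceLocation` struct, and the `formatError` function are all assumptions.

```cpp
#include <string>

// Assumed location record attached to tokens and AST nodes.
struct SourceLocation {
    std::string file;
    int line;
    int column;
};

// Builds "file:line:column: <stage> error: <message>".
std::string formatError(const SourceLocation& loc, const std::string& stage,
                        const std::string& message) {
    return loc.file + ":" + std::to_string(loc.line) + ":" + std::to_string(loc.column) +
           ": " + stage + " error: " + message;
}
```

For instance, a semantic error at line 3, column 9 of a file named `examples/simple.spc` would be rendered by this sketch as `examples/simple.spc:3:9: semantic error: undeclared identifier 'y'`.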

Testing

All test programs are in examples/:

  • simple.spc: Basic variable assignment
  • arithmetic.spc: Arithmetic operations and function calls
  • conditional.spc: If/else statements
  • loop.spc: While loops and factorial
  • test_param.spc: Parameter passing verification

Run ./demo.sh to test all examples.