← Back to Projects

C Compiler with Control Flow Structures

C GCC Recursive Descent Parser Compiler Design Linux
C Compiler architecture and parser design
Compiler execution and output
Parser grammar and syntax trees
Compiler optimization techniques

Project Overview

A feature-rich C compiler built from scratch using recursive descent parsing techniques. This project successfully implements a complete parsing and execution engine for a subset of the C language, with full support for control flow structures (if-else, while, for, do-while, switch-case) and comprehensive operator precedence handling.

The compiler demonstrates deep understanding of compiler theory, including tokenization, syntax analysis, semantic analysis, and runtime execution. It processes source code files and executes programs with correct control flow behavior.

Key Features

  • Control Flow Statements: if-else, while loops, for loops, do-while loops, and switch-case statements
  • Operators: All comparison operators (==, !=, <, >, <=, >=), logical operators (&&, ||, !), and arithmetic operators (+, -, *, /, %)
  • Variables & Declarations: Variable declarations with initialization, symbol table management, and declaration tracking
  • Expression Evaluation: Full operator precedence with 7-level precedence hierarchy
  • Semantic Analysis: Undeclared variable detection, type checking, and error reporting
  • Break Statement Support: Proper break handling in loops and switch statements with correct control flow
  • Nested Structures: Full support for arbitrarily nested control structures
  • Large Program Support: Scalable token (5000) and symbol (500) limits for comprehensive programs

Technical Implementation

The compiler is implemented in approximately 1000 lines of C code using a recursive descent parser architecture. The tokenizer converts source text into tokens, the parser builds an abstract representation, and the semantic analyzer validates correctness.

Parsing Architecture

The parser uses recursive descent with proper operator precedence handling through a multi-level expression parsing hierarchy:

  • Expression → LogicalOr (lowest precedence)
  • LogicalOr → LogicalAnd
  • LogicalAnd → Comparison
  • Comparison → Additive
  • Additive → Term
  • Term → Factor (highest precedence)

Control Flow Implementation

Control flow statements are implemented with proper scoping and execution semantics:

  • While Loops: Condition re-evaluation with state preservation across iterations
  • For Loops: Three-part for(init; condition; increment) syntax with loop management
  • If-Else Statements: Conditional branching with nested structure support
  • Switch-Case Statements: Case matching with proper break handling to prevent fall-through
  • Break Statement: Correct termination of loops and switch blocks

Grammar

Program       → Statement*
Statement     → Declaration | Assignment | IfStatement | WhileStatement | ForStatement 
                | SwitchStatement | BreakStatement | Block | PrintStatement
Declaration   → 'int' Identifier '=' Expression ';'
IfStatement   → 'if' '(' Expression ')' Block ('else' Block)?
WhileStatement → 'while' '(' Expression ')' Block
ForStatement  → 'for' '(' Assignment ';' Expression ';' Assignment ')' Block
SwitchStatement → 'switch' '(' Expression ')' '{' (CaseLabel | DefaultLabel)* '}'
CaseLabel     → 'case' Number ':' Statement* 'break' ';'
DefaultLabel  → 'default' ':' Statement* 'break' ';'
Expression    → LogicalOr ((TOKEN_ASSIGN) LogicalOr)*
LogicalOr     → LogicalAnd (('||') LogicalAnd)*
LogicalAnd    → Comparison (('&&') Comparison)*
Comparison    → Additive (CompOp Additive)*
Additive      → Term (AddOp Term)*
Term          → Factor (MulOp Factor)*
Factor        → [UnaryOp] (Primary)

Challenges & Solutions

Challenge 1: While Loop Execution - Initial implementation only executed the loop body once. Solution: Implemented proper loop re-evaluation by saving the condition position, evaluating it repeatedly, and executing the body until the condition becomes false.

Challenge 2: Break Statement in Switch - Break statements were not preventing fall-through to subsequent cases and default blocks. Solution: Added a `switchDone` flag that properly exits the switch statement after a break is executed in a matched case, preventing any further case evaluation.

Challenge 3: Operator Precedence - Ensuring correct evaluation order for complex expressions. Solution: Implemented a formal 7-level precedence hierarchy through separate parsing functions, with highest-precedence operations at the deepest recursion level.

Challenge 4: Variable Scoping - Managing variable declarations and avoiding reuse across statement blocks. Solution: Implemented symbol table management with initialization tracking and declaration validation.

Results

The compiler successfully executes comprehensive programs with all control flow features. Created a 252-line test file demonstrating:

  • Arithmetic expressions with correct precendence (e.g., `r = p + q * 2`)
  • All comparison operators in conditional statements
  • Nested if-else structures with multiple conditions
  • Multiple while loops with various patterns and loop counters
  • Switch-case statements with multiple cases and default behavior
  • Fibonacci sequence calculation via loop iteration
  • Complex sum and factorial computations
  • Logical operator combinations in compound conditions

The test file compiles without errors and produces 41+ correct output values demonstrating all features working in concert.

File Structure

  • parser.c (984 lines) - Main compiler implementation with tokenizer, parser, and semantic analyzer
  • test_simple_comprehensive.txt (252 lines) - Comprehensive test demonstrating all features
  • test_while.txt, test_for.txt, test_switch.txt, etc. - Individual feature test files
  • README.md - Complete documentation of compiler features and usage

Key Statistics

  • Size: 984 lines of C code
  • Token Types Supported: 40+ tokens including keywords, operators, and punctuation
  • Parsing Functions: 15+ specialized parsing functions with proper separation of concerns
  • Expression Precedence Levels: 7 levels with correct evaluation semantics
  • Token Capacity: 5000 tokens (increased from 1000 for large program support)
  • Symbol Capacity: 500 symbols (increased from 100)
  • Test Coverage: 20+ test files covering individual features and comprehensive scenarios

Usage

$ gcc -o parser parser.c -Wall
$ ./parser program.txt

Future Enhancements

  • Add function definitions and calls with parameter passing
  • Implement arrays and pointer operations
  • Add standard library functions (printf with format strings, strlen, etc.)
  • Support for struct and typedef declarations
  • Implement proper memory management and stack frame handling
  • Add optimization passes and intermediate code generation
  • Support for preprocessing directives (#include, #define)
  • Debugger integration with breakpoint support