The compiler is a vital component in the programming process that translates high-level language to machine code. This comprehensive guide to compiler design begins by introducing students to the compiler and its functions. It then explains in detail each phase of compiler design – lexical, syntax and semantic analysis, code generation and optimisation. It clarifies important internal processes such as storage management, the symbol table and parallel compiling. It also describes various error handling techniques and provides an overview of the open-source compiler construction tools currently available. Salient Features
Manoj B Chandak is Professor and Head of the Department of Computer Science and Engineering at Shri Ramdeobaba College of Engineering and Management, in Nagpur, Maharashtra. He has more than 25 years of teaching experience and has published more than 100 papers in national and international journals. He also guides PhD and PG research scholars.
Khushboo P Khurana is Assistant Professor at the Department of Computer Science and Engineering at Shri Ramdeobaba College of Engineering and Management, in Nagpur, Maharashtra. She is an M.Tech gold medalist and teaches compiler design at undergraduate level.
Preface Acknowledgements 1. Introduction to Compilers Objectives 1.1 Introduction 1.2 History of Compiler 1.2.1 C Compiler 1.2.2 Java Compiler 1.3 Language Processing System 1.3.1 Preprocessor 1.3.2 Assembler 1.3.3 Linker 1.3.4 Loader 1.4 Interpreter and Compiler 1.4.1 Interpreter 1.4.2 Compiler 1.4.3 Interpreter vs Compiler 1.5 Phases of Compiler 1.5.1 Lexical Analysis 1.5.2 Syntax Analysis 1.5.3 Semantic Analysis 1.5.4 Intermediate Code Generation 1.5.5 Code Optimisation 1.5.6 Code Generation 1.5.7 Symbol Table and Error Handling 1.6 Overview of Compiler Design 1.6.1 Pass and Phase 1.7 Application of Techniques Used in Compiler Design 1.8 Tombstone Diagram 1.9 Cross-compiler and Bootstrapping 1.10 Relating Compilation Phases with Formal Systems 1.11 Compiler Construction Tools Summary Exercises Review Questions Practice Questions 2. Lexical Analysis Objectives 2.1 Lexical Analysis and Tokens 2.2 Input Buffering 2.2.1 Two-buffer Input Scheme Using Buffer Pair 2.3 Separation of Lexical Analysis and Syntax Analysis 2.4 Specification of Tokens 2.5 Tokens, Patterns and Lexemes 2.6 Attributes for Tokens 2.7 Lexical Analyser and Symbol Table 2.8 Regular Expression 2.9 Transition Diagram 2.10 Recognition of Tokens 2.11 Lexical Errors 2.12 NFA and DFA for Lexical Analysis 2.12.1 Combining NFAs of Different Patterns into a Single NFA 2.13 Regular Expression to NFA 2.14 NFA to DFA Conversion 2.15 Regular Expression to DFA 2.15.1 Direct Method for Conversion of RE to DFA 2.16 DFA Minimisation Summary Exercises Review Questions Practice Questions Programming Questions 3. Syntax Analysis Objectives 3.1 Introduction to Syntax Analysis 3.2 Types of Grammars 3.3 Context-free Grammar 3.4 Derivation 3.5 Ambiguous Grammar 3.6 Top-down Parsing 3.6.1 Non-predictive Parsing 3.6.2 Predictive Top-down Parsing 3.7 LL(1) – Predictive Top-down Parser 3.8 Bottom-up Parsing 3.8.1 Handle and Viable Prefix 3.8.2 Shift-reduce Parsers 3.9 Simple LR Parser (SLR) 3.10 Canonical LR Parser (CLR) 3.11 LALR Parser 3.12 Ambiguous Grammars in LR Parsing 3.13 Parser Conflicts 3.14 Handling Ambiguous Grammars in LALR Parser 3.15 Selection of Parsing Algorithm 3.16 Comparison of LR Parsers 3.17 Applications of the LR Parser 3.18 LR Parser in Natural Language Processing Summary Exercises Review Questions Practice Questions Programming Questions 4. Syntax-directed Translation and Semantic Analysis Objectives 4.1 Introduction to Semantic Analysis and Type Checking 4.2 Attributes and Specification of Semantic Rules 4.3 Syntax-directed Definition 4.4 Syntax-directed Translation Scheme 4.5 Dependency Graph 4.6 Methods to Calculate Semantic Rules 4.7 Evaluation Order of Semantic Rules 4.8 Syntax Trees 4.9 Directed Acyclic Graph Construction 4.10 Type and Type Checking 4.10.1 Basic Types and Constructed Types 4.10.2 Type System and Type Checker 4.10.3 Type Expressions 4.10.4 Operator and Function Overloading 4.10.5 Static and Dynamic Typing 4.10.6 Encoding of Type Expressions 4.11 A Simple Type Checking System 4.12 Type Conversion 4.12.1 Coercion 4.13 Type Equivalence Summary Exercises Review Questions Practice Questions Programming Questions 5. Intermediate Code Generation Using Syntax-directed Translations Objectives 5.1 Introduction to Intermediate Code Generation 5.2 Intermediate Code Representation 5.3 Syntax-directed Translation into Three-address Code – Principle 5.4 Syntax-directed Translation of Assignment Statement into Three-address Code 5.4.1 Assignment Statement 5.5 Syntax-directed Translation to TAC for Boolean Expressions 5.5.1 Method 1: Numerical Representation 5.5.2 Method 2: Short Circuit Code 5.6 Syntax-directed Translation into TAC for Programming Language Control Structures Using Backpatching 5.6.1 If-then Construct 5.6.2 If-then-else Construct 5.6.3 While Loop 5.6.4 Do-while Loop 5.6.5 For Loop 5.6.6 Repeat-Until 5.7 Syntax-directed Translation of Arrays to TAC 5.8 Syntax-directed Translation to TAC for the Switch-Case Statement 5.9 Syntax-directed Translation to TAC for Procedures 5.10 Syntax-directed Translation to TAC for Declarations Summary Exercises Review Questions Practice Questions 6. Code Optimisation Objectives 6.1 Introduction 6.2 Machine-independent Optimisation 6.2.1 Common Sub-expression Elimination 6.2.2 Algebraic Simplification and Strength Reduction 6.2.3 Dead Code Elimination 6.2.4 Copy Propagation 6.2.5 Constant Propagation 6.2.6 Constant Folding 6.2.7 Function Inlining and Cloning 6.2.8 Loop Optimisation 6.3 Machine-dependent Optimisation 6.3.1 Peephole Optimisation Summary Exercises Review Questions Practice Questions Programming Questions 7. Code Generation Objectives 7.1 Introduction to Code Generation 7.2 Problems in Code Generator Design 7.3 Target Machine 7.4 Instruction Cost 7.5 Simple Code Generation 7.5.1 Code Generation for a Three-address Statement of the form X = Y op Z 7.5.2 Reordering of Statements for Improved Cost of Generated Code 7.5.3 Code Generation for a Three-address Statement of the Form X = Y 7.5.4 Generating Code for Array Assignment and Pointer 7.6 Code Generation Using DAG 7.6.1 Heuristic for Finding the Optimal Order of Nodes in DAG 7.6.2 Labelling Algorithm 7.6.3 Code Generation From a Labelled Tree 7.7 Register Allocation 7.7.1 Global Register Allocation 7.7.2 Register Assignment for an Outer Loop 7.7.3 Register Allocation by Graph Colouring 7.8 Code Generation by Dynamic Programming Summary Exercises Review Questions Practice Questions Programming Questions 8. Storage Management and Symbol Table Objectives 8.1 Introduction to the Run-time Environment 8.2 Storage Organisation 8.3 Activation of Procedures 8.3.1 Activation Tree 8.3.2 Control Stack 8.3.3 Activation Record 8.4 Scope of Declaration 8.5 Storage Allocation Strategies 8.5.1 Static Storage Allocation 8.5.2 Stack Storage Allocation 8.5.3 Heap Storage Allocation 8.6 Garbage Collection 8.7 Parameter Passing 8.8 Symbol Table 8.9 Data Structure for Symbol Table 8.9.1 Linear List 8.9.2 Search Tree: Binary Search Tree 8.9.3 Hash Table 8.10 Scope of Information in Symbol Table Summary Exercises Review Questions Practice Questions Programming Questions 9. Error Handling Objectives 9.1 Introduction to Error Generation and Error Handling 9.2 Error Handling in Each Phase of Compilation 9.3 Error Handling in Lexical Analysis Phase 9.4 Error Handling in Syntax Analysis Phase 9.4.1 Error Recovery in Predictive LL Parser 9.4.2 Errors in Shift-Reduce Parsers (LR Parser) 9.5 Error Handling in Semantic Analysis Phase Summary Exercises Review Questions Practice Questions Programming Questions 10. Compiler Construction Tools Objectives 10.1 Introduction to Open Source Compiler Construction Tools 10.2 Java Formal Languages and Automata Package (JFLAP) 10.3 Scanner Generator: Lex 10.3.1 Lex Source File 10.3.2 Lex Regular Expressions 10.3.3 Ambiguous Source Rules 10.4 Parser Generator: Yet Another Compiler-Compiler (Yacc) 10.5 JFlex and CUP 10.6 ANTLR 10.7 Other Tools and Techniques Summary Exercises Programming Questions 11. Functional Languages Objectives 11.1 Introduction to Functional Languages 11.2 Characteristics of Functional Languages 11.3 Concepts in Functional Languages 11.3.1 First Class Functions 11.3.2 Pure Functions 11.3.3 Higher-order Functions 11.3.4 Closure 11.3.5 Currying 11.3.6 Lazy Evaluation 11.3.7 Recursion 11.4 Introduction to Lambda Notation/?-calculus 11.4.1 Free and Bound Variables 11.4.2 Substitution 11.4.3 Arithmetic Computations 11.4.4 Recursion 11.5 Basic Compilation 11.6 Polymorphic Type Checking 11.7 Desugaring Summary Exercises Review Questions 12. Parallel Compilers and Scheduling Objectives 12.1 Parallel Programming Concepts 12.2 Processes and Threads 12.3 Shared Memory and Message Passing 12.4 NVCC for Parallel Compilation 12.5 Dependency 12.6 Iteration Spaces 12.7 Affine Array Indexes Summary Exercises Review Questions 13. Programs on Compiler Design Objectives 13.1 Regular Expressions in Python 13.2 Programs for Different Phases of the Compiler Exercises Practice Questions 14. Solved Gate Questions Lexical Analysis Parsing SDTS Code Generation and Optimisation Memory Allocation