diff --git a/textbook/01-overview.md b/textbook/01-overview.md
index 16c6b8d..342876d 100644
--- a/textbook/01-overview.md
+++ b/textbook/01-overview.md
@@ -1,6 +1,3 @@
-
-\pagebreak
-
-Introduction
-============
-
-## Overview
+#Overview
+
+##1.1 Introduction
+###1.1.1 Definition of a Compiler
+###1.1.2 History and Purpose
+####1.1.2.1 Grace Hopper
+####1.1.2.2 Purpose
+#####1.1.2.2.1 Translate Source Language to Target Language
+#####1.1.2.2.2 Object Code and Executables
+#####1.1.2.2.3 Platform-Independent Compilers
+###1.1.3 Comparison between Compiler and Interpreter
+###1.1.4 Hardware Compilation
+##1.2 Compiler Design
+###1.2.1 One-Pass vs Multi-Pass
+####1.2.1.1 One-Pass
+#####1.2.1.1.1 Simple to Implement
+#####1.2.1.1.2 Limited Optimization
+####1.2.1.2 Multi-Pass
+#####1.2.1.2.1 Enhanced Optimization
+#####1.2.1.2.2 Easier to Prove Correctness
+#####1.2.1.2.3 Source-to-Source Compilation Possible (Translators)
+#####1.2.1.2.4 Source-Bytecode-Native Code
+###1.2.2 Structure
+####1.2.2.1 Front End
+#####1.2.2.1.1 Creates Intermediate Representation
+#####1.2.2.1.2 Manages Symbol Table
+#####1.2.2.1.3 Steps
+######1.2.2.1.3.1 Preprocessing
+######1.2.2.1.3.2 Lexical Analysis
+######1.2.2.1.3.3 Syntax Analysis
+######1.2.2.1.3.4 Semantic Analysis
+####1.2.2.2 Back End
+#####1.2.2.2.1 Steps
+######1.2.2.2.1.1 Analysis
+######1.2.2.2.1.2 Optimization
+######1.2.2.2.1.3 Code Generation
 ### What is a compiler?
-Lexical Analysis
-================
-### What is a regular language?
+#Lexical Analysis
+
+##2.1 Grammars
 [Regular expressions](#what-is-a-regular-expression) define the regular languages. [Regular grammars](#what-is-a-regular-grammar) and [finite automata](#what-is-a-finite-automaton) recognize regular languages.
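The relationship between regular expressions and finite automata stated above can be made concrete with a small sketch. It assumes, purely for illustration, that the regular language in question is the set of C-style identifiers; the names `regex_accepts`, `dfa_accepts`, and the state labels are hypothetical and do not come from the textbook. The same language is described once by a regular expression and once by a hand-written deterministic finite automaton.

```python
import re

# Assumed example language: C-style identifiers (a letter or underscore,
# followed by any number of letters, digits, or underscores).
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def regex_accepts(text: str) -> bool:
    """Membership test using the regular expression that defines the language."""
    return IDENTIFIER_RE.fullmatch(text) is not None


# Hypothetical state names for the equivalent finite automaton.
START, IN_IDENT, DEAD = "start", "in_ident", "dead"

# Transition table: (state, character class) -> next state.
# Any pair that is missing leads to the dead (rejecting) state.
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
}


def _char_class(ch: str) -> str:
    """Map a character to the input alphabet used by the automaton."""
    if ch.isascii() and (ch.isalpha() or ch == "_"):
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


def dfa_accepts(text: str) -> bool:
    """Membership test by running the automaton over the input."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, _char_class(ch)), DEAD)
        if state == DEAD:
            return False
    return state == IN_IDENT  # IN_IDENT is the only accepting state


if __name__ == "__main__":
    for sample in ["x", "_tmp1", "1x", "a-b", ""]:
        print(repr(sample), regex_accepts(sample), dfa_accepts(sample))
```

Both functions agree on every input, which is what it means for the expression and the automaton to describe the same regular language. A scanner generator would typically derive the transition table from the expression automatically instead of having it written by hand.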
+###2.1.1 Defined in Language Specification
+###2.1.2 Tokens and Lexemes
+####2.1.2.1 Defined in Specification
+####2.1.2.2 Describe the Set of Valid Character Sequences
+##2.2 Components
+###2.2.1 Tokens
+####2.2.1.1 Structured Text
+####2.2.1.2 Categorized
+####2.2.1.3 Example
+#####2.2.1.3.1 int x = 3;
+#####2.2.1.3.2 Tokens
+######2.2.1.3.2.1 int (variable type)
+######2.2.1.3.2.2 x (variable)
+######2.2.1.3.2.3 = (operator)
+######2.2.1.3.2.4 3 (value)
+###2.2.2 Tokenizer
+###2.2.3 Scanner
+####2.2.3.1 Finite State Machine
+####2.2.3.2 Contains Information on What Constitutes a Valid Token
+###2.2.4 Evaluator
+####2.2.4.1 Works with Lexemes
+####2.2.4.2 Produces a Value
+
 #### Follow-up questions
 - [What is a regular expression](#what-is-a-regular-expression)?
 - [How can you tell if a language is regular](#how-can-you-tell-if-a-language-is-regular)?
diff --git a/textbook/03-parsing.md b/textbook/03-parsing.md
index fca4a68..c28b33d 100644
--- a/textbook/03-parsing.md
+++ b/textbook/03-parsing.md
@@ -1,6 +1,3 @@
-
-\pagebreak
-
 >TODO : The next step of the compilation process is parsing. Parsing takes input from the Lexical Analysis step and builds a parse tree, which will be used in future steps to develop the machine code. In this unit, we will define parsing and identify its uses.
@@ -54,71 +51,77 @@ TOPICS:
 3.3.2.2 Combine Terminal Symbol to Produce Nonterminals
 -->
-Parsing
-=======
-
-### 3.1 Parsing Overview
-Syntax Analysis also known as parsing is the process of analyzing tokens and
-recombining them into a syntax tree.
+#Parsing
+## 3.1 Parsing Overview
+The concept of parsing has been around since the advent of written language millennia ago. More formally known as syntactic analysis, parsing is the process of analyzing a sequence of tokens in order to
+determine its grammatical structure. While it is used to diagram languages such as Latin, it also has extremely important implications for computing.
+Compilers and interpreters use syntactic analysis to recover the structure of source programs; without it there would be no internal representation of a program to analyze or translate.
+Although parsing is easy to take for granted, it is a vital part of any compiler or interpreter.
-#### 3.1.1 Function
-Syntax analysis will verify that the input`s syntax is valid.
+### 3.1.1 Function
+Syntactic analysis verifies that the input's syntax is valid.
+#### 3.1.1.1 Input: Tokens from Lexical Analysis
+Lexical analysis splits the input into tokens, which the syntax analyzer then recombines into a syntax tree.
-##### 3.1.1.1 Input: Tokens from Lexical Analysis
-Lexical analysis splits input into tokens which the syntax analyzer then
-recombines into a syntax tree.
+#### 3.1.1.2 Output: Program Parse Tree
+The parse tree is built during syntax analysis according to the syntax specification.
+The leaves of the parse tree are the tokens generated during lexical analysis.
-##### 3.1.1.2 Output: Program Parse Tree
-Recombining of a syntax parse tree during lexical analysis is done according to
-the syntax specification.
-The leaves of the parse tree are the tokens generated
-during lexical analysis.
+### 3.1.2 Examples
-#### 3.1.2 Examples
+#### 3.1.2.1 Given an Arbitrary Function
-##### 3.1.2.1 Given an Arbitrary Function
+#### 3.1.2.2 Produce:
-##### 3.1.2.2 Produce:
+##### 3.1.2.2.1 Parser Input
-###### 3.1.2.2.1 Parser Input
+##### 3.1.2.2.2 Parse Tree
-###### 3.1.2.2.2 Parse Tree
+### 3.1.3 Context-Free Grammar
-#### 3.1.3 Context-Free Grammar
+## 3.2 Top-Down Parsing
+A parser can construct the derivation of its input in two ways. The first is known as top-down parsing.
+In top-down parsing, tokens are read from left to right, and the parse tree is built from its highest level (the start symbol) downward.
-### 3.2 Top-Down Parsing
+### 3.2.1 Traversing a Parse Tree
+Not surprisingly, the parse tree is traversed from the top down in top-down parsing.
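A small recursive-descent parser may help make the top-down description concrete. This is a minimal sketch under stated assumptions, not the textbook's own parser: the arithmetic grammar, the token pattern, and every name in it (`tokenize`, `Parser`, the tuple-shaped tree nodes) are hypothetical. Each grammar rule becomes one method, parsing starts at the start symbol `expr`, and tokens are consumed from left to right as the methods call one another down the tree. For brevity the result is a simplified tree that keeps only operators and operands instead of every nonterminal.

```python
import re

# Assumed toy grammar (not from the textbook):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'
TOKEN_RE = re.compile(r"\s*([0-9]+|[()+\-*/])")


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Top-down parser: one method per nonterminal, starting at expr."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()  # begin at the start symbol
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return ("num", int(token))
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


if __name__ == "__main__":
    print(Parser(tokenize("1 + 2 * 3")).parse())
```

Because each method decides what to do by looking only at the next token, this is a hand-written cousin of an LL(1) parser. A table-driven LL parser makes the same decisions from a parse table instead of from method calls.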
-#### 3.2.1 Traversing a Parse Tree
+#### 3.2.1.1 Definition
-##### 3.2.1.1 Definition
+#### 3.2.1.2 Example
+LL parsers are examples of top-down parsers. (diagram here?)
-##### 3.2.1.2 Example
+### 3.2.2 Backus-Naur Form Production Rules
-#### 3.2.2 Backus-Naur Form Production Rules
+### 3.2.3 LL Parser
-#### 3.2.3 LL Parser
+### 3.2.4 Process
-#### 3.2.4 Process
+#### 3.2.4.1 Starts at Left-most Symbol Yielded from Production Rule
-##### 3.2.4.1 Starts at Left-most Symbol Yielded from Production Rule
+#### 3.2.4.2 Continues to Next Production Rule for Each Non-Terminal Symbol
-##### 3.2.4.2 Continues to Next Production Rule for Each Non-Terminal Symbol
+#### 3.2.4.3 Proceeds "Down" the Parse Tree
-##### 3.2.4.3 Proceeds "Down" the Parse Tree
+## 3.3 Bottom-Up
 ### 3.3.1 Bottom-Up Parsing
+The second method of parsing is known as bottom-up parsing. Using this method, a parser begins with the input tokens, identifies
+the simplest elements of the language first, and works backwards toward the start symbol.
-##### 3.3.1.1 Definition
-
-##### 3.3.1.2 Examplel
+#### 3.3.1.1 Definition
+#### 3.3.1.2 Example
+LR parsers are examples of bottom-up parsers. (diagram here?)
 #### 3.3.2 Process
-##### 3.3.2.1 Identify Terminal Symbols First
+#### 3.3.2.1 Identify Terminal Symbols First
+
+#### 3.3.2.2 Combine Terminal Symbols to Produce Nonterminals
+
+
-##### 3.3.2.2 Combine Terminal Symbol to Produce Nonterminals
 ### What is a context-free language?
diff --git a/textbook/04-ast-and-symbol-tables.md b/textbook/04-ast-and-symbol-tables.md
index dae59ce..abf6267 100644
--- a/textbook/04-ast-and-symbol-tables.md
+++ b/textbook/04-ast-and-symbol-tables.md
@@ -1,6 +1,3 @@
-
-\pagebreak
-
+#AST and Symbol Tables
+
+##5.2 Symbols
+###5.2.1 Definition
+###5.2.2 Symbol Table
+####5.2.2.1 Gives Information about an Identifier
+#####5.2.2.1.1 Declaration Information
+#####5.2.2.1.2 Scope
+#####5.2.2.1.3 Type
+#####5.2.2.1.4 Memory Address
+####5.2.2.2 Implemented as a Hash Table
+####5.2.2.3 Contained within the Object File
+#####5.2.2.3.1 Used by Linker to Resolve References
+#####5.2.2.3.2 Kept in Object Files for Debug Builds
 Abstract Syntax Trees and Symbol Tables
 =======================================
diff --git a/textbook/05-semantic-analysis.md b/textbook/05-semantic-analysis.md
index e3d767f..c19d08d 100644
--- a/textbook/05-semantic-analysis.md
+++ b/textbook/05-semantic-analysis.md
@@ -1,6 +1,3 @@
-
-\pagebreak
-
-
-
-Semantic Analysis
-=================
+#Semantic Analysis
+##4.1 Overview
+###4.1.1 Relation to Parse Tree
+####4.1.1.1 Input from Parser
+####4.1.1.2 Adds Semantic Information to Parse Tree
+###4.1.2 Output to Code Generation Phase
+##4.2 Process
+###4.2.1 Type Checking
+####4.2.1.1 Verify Type Constraints
+####4.2.1.2 Static Checking
+#####4.2.1.2.1 Done at Compile Time
+#####4.2.1.2.2 Dynamic Checking Done at Runtime
+#####4.2.1.2.3 Example Languages
+######4.2.1.2.3.1 Ada
+######4.2.1.2.3.2 C++
+######4.2.1.2.3.3 Java
+####4.2.1.3 Type Safety
+####4.2.1.4 Types Specified by the Language Specification
+###4.2.2 Object Binding
+####4.2.2.1 Associates Variable with Its Definition
+####4.2.2.2 Resolve Object References
+###4.2.3 Assignment Operations
+####4.2.3.1 Data Flow Analysis
+####4.2.3.2 Definite Assignment Analysis
+#####4.2.3.2.1 Ensures Variables Are Assigned Before Use
+#####4.2.3.2.2 Allows Potential Optimization
+###4.2.4 Produce Errors/Warnings
+##4.3 Time/Space Complexity
 ### What is semantics?
+#Intermediate Representation
+##5.1 Types
+###5.1.1 Types of Types
+####5.1.1.1 Primitive
+####5.1.1.2 Reference
+####5.1.1.3 Null
+####5.1.1.4 Object
+####5.1.1.5 Function
+###5.1.2 Type Checking
+####5.1.2.1 Static Typing
+####5.1.2.2 Dynamic Typing
+####5.1.2.3 Strong Typing
+####5.1.2.4 Weak Typing
+##5.3 Runtime Organization
+###5.3.1 Storage
+####5.3.1.1 Allocation
+#####5.3.1.1.1 Static
+#####5.3.1.1.2 Dynamic
+####5.3.1.2 Local References
+####5.3.1.3 Global References
+###5.3.2 Runtime
+####5.3.2.1 Debugging vs Release
+####5.3.2.2 Runtime Exceptions
 ###### Types
diff --git a/textbook/07-optimization.md b/textbook/07-optimization.md
index f3a94c6..6415301 100644
--- a/textbook/07-optimization.md
+++ b/textbook/07-optimization.md
@@ -1,4 +1,3 @@
-
 \pagebreak
+#Code generation
-Code generation
-===============
-### What is code generation?
+##8.1 Overview
+Code generation is the final compiler phase. It produces code in the target language, which is typically a machine language (e.g., x86, ARM), but may be assembly or even a high-level language.
+
+The code generator is distinct from the parser and the translator.
 Code generation is the final [compiler phase](#what-are-the-phases-of-a-compiler). It produces code in the target language, which is typically a machine language (e.g., x86, arm), but may be assembly or even a high-level language.
@@ -71,3 +74,47 @@ The code generator is distinct from the [parser](#what-is-a-parser) and the [tra
 Code generators try to optimize the generated code by doing several different things including using faster instructions, using fewer instructions, exploit available registers, and avoid redundant computations.
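Two of the goals listed above, using fewer instructions and avoiding redundant computations, can be sketched with a toy code generator. Everything here is an assumption made for illustration: the tuple-shaped expression tree, the function names `fold_constants` and `generate`, and the stack-machine instructions `PUSH`, `LOAD`, `ADD`, `SUB`, and `MUL` are hypothetical and do not correspond to any real target. The generator walks the tree in post-order, and an optional constant-folding pass evaluates constant subexpressions at compile time so that they need not be recomputed at run time.

```python
# Assumed tree shape: ("num", value), ("var", name), or (op, left, right)
# with op in {"+", "-", "*"}.

# Hypothetical stack-machine opcodes for each operator.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def fold_constants(node):
    """Replace operator nodes whose operands are both constants by their value."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "num" and right[0] == "num":
        if op == "+":
            return ("num", left[1] + right[1])
        if op == "-":
            return ("num", left[1] - right[1])
        if op == "*":
            return ("num", left[1] * right[1])
    return (op, left, right)


def generate(node):
    """Emit stack-machine instructions by a post-order walk of the tree."""
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    if node[0] == "var":
        return [f"LOAD {node[1]}"]
    op, left, right = node
    return generate(left) + generate(right) + [OPCODES[op]]


if __name__ == "__main__":
    tree = ("+", ("var", "x"), ("*", ("num", 2), ("num", 3)))  # x + 2 * 3
    print(generate(tree))                  # five instructions
    print(generate(fold_constants(tree)))  # three instructions
```

For `x + 2 * 3` the unoptimized walk emits five instructions, while the folded tree needs only three, since `2 * 3` is computed once by the compiler. Real code generators apply the same idea over an intermediate representation and must also account for registers and instruction costs.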
+###8.1.1 Produces Machine-Executable Code
+###8.1.2 Input Parse Tree
+###8.1.3 Output Machine Code
+###8.1.4 Includes Some Optimization Techniques
+##8.2 Process
+###8.2.1 Instruction Selection
+####8.2.1.1 Transforms Middle-Level IR to Low-Level IR
+#####8.2.1.1.1 Middle-Level IR
+######8.2.1.1.1.1 Tree-Based
+######8.2.1.1.1.2 Intermediate Representation
+#####8.2.1.1.2 Low-Level IR
+######8.2.1.1.2.1 Reduced From Tree
+######8.2.1.1.2.2 Close to Target Language (Machine Code)
+####8.2.1.2 Templates and Tiles
+#####8.2.1.2.1 Tiles
+######8.2.1.2.1.1 Template That Matches a Portion of IR Tree
+######8.2.1.2.1.2 Implemented with a Single Target Instruction
+#####8.2.1.2.2 Templates
+######8.2.1.2.2.1 Convert Code from IR to Target Language
+######8.2.1.2.2.2 Open to Optimization
+#####8.2.1.2.3 Implementation
+######8.2.1.2.3.1 Backward Dynamic Programming
+######8.2.1.2.3.2 Greedy Algorithms
+###8.2.2 Instruction Scheduling
+####8.2.2.1 Optimization Technique
+#####8.2.2.1.1 Reorders Instructions for Optimal Processing
+#####8.2.2.1.2 Avoids Pipeline Stalls from Data and Structural Hazards
+####8.2.2.2 Types of Scheduling Algorithms
+###8.2.3 Register Allocation
+####8.2.3.1 Multiplexes Program Variables to CPU Registers
+#####8.2.3.1.1 Minimize Program Execution Time
+#####8.2.3.1.2 Occurrences
+######8.2.3.1.2.1 Local
+######8.2.3.1.2.2 Global
+######8.2.3.1.2.3 Interprocedural
+####8.2.3.2 NP-Complete Optimization Problem
+###8.2.4 Non-Standard Compilers
+####8.2.4.1 Just-In-Time Compilation
+####8.2.4.2 Profiling
+
+