CS325 Compiler Design

Lecture 1 – October 4th, 2017

Missed…

Lecture 2 – October 6th, 2017

Part 1 Lexical Analyzer (Lexing)

Slides

Lecture 3 – October 9th, 2017

DFA to minimal DFA: Hopcroft’s algorithm

  1. Given DD states in the DFA partition the states into two groups – FF and SFS-F, the accepting states and the nonaccepting states. Denote this initial partitioning as P=FP = F, SFS-F.
  2. Let new partitioning Pnew=PP_{new} = P
    For each group GG in PnewP_{new}
    partition into subgroups such that two states ss and tt are in the same subgroup if and only if for all input symbols aa

DFA partitions

Part 2 Syntax Analyzer (Parsing)

Overview

  • Context free grammars (CFGs) – formal mechanism for

    • specifying the syntax of the source language and
    • a systematic method of determining membership in this formally specified language
    • we use context-free grammars to specify the grammatical structure of programming languages.
  • Algorithms for doing this:

    • Top-down parsing – LL(1)LL(1) and recursive descent parsers.
    • Bottom-up parsing – LR(1)LR(1) parsers, canonical LR(1)LR(1) parsers, LALR parser.

Regular expressions are not sufficiently powerful to represent context-free structures. It doesn’t understand recursion, for example.

Context Free Grammars (CFGs)

Formally

  • TT – A set of terminal symbols
  • NTNT – A set of nonterminals
  • SS – A designation of one of the nonterminals as the start symbol
  • PP – A set of productions, where each production consists of a nonterminal, called the head or left side of the production, an arrow and a sequence of terminals and/or nonterminal, called the body or right side of the production.

Context free grammar

Backus-Naur form (BNF)

<SheepNoise> ::= baa <SheepNoise>
               | baa
  • Nonterminal symbols wrapped in angle brackets <SheepNoise>
  • Terminal symbols underlined.
  • The symbol ::= means “derives,” and the symbol | means “also derives“

Ambiguous Grammars

Consider the grammar: EE+EEE(E)idE \rightarrow E + E | E \ast E | (E) | \text{id}

Now consider the input: id * id + id

This input can be derived in two different ways:

Ambiguous Grammars

Lecture 4 – October 12th, 2017

Lecture 5 – October 13th, 2017

Lecture 6 – October 16th, 2017

Lecture 7 – October 19th, 2017

Lecture 8 – October 20th, 2017

Lecture 9 – October 23rd, 2017