Coursify

Compiler Design (CD)

PYQ Analysis and Exam Preparation

50 mins

Comprehensive analysis of Previous Year Questions (PYQs) for Module 4. Master the construction of Quadruples and Triples tables, solve iterative Data Flow Analysis numericals, and review high-yield optimization theory.

Learning Goals

  • Identify frequently tested numerical patterns in Intermediate Code Generation and Optimization.
  • Practice converting complex arithmetic expressions into Quadruple, Triple, and Indirect Triple representations.
  • Execute the iterative worklist algorithm for Reaching Definitions and Live Variable analysis as required in 14-mark questions.
  • Review standard theoretical questions on loop optimizations and register allocation strategies.

Module Weightage & Focus Areas

Typical mark distribution across core optimization and generation topics.

Note: Data Flow Analysis often appears as the primary 14-mark numerical.

πŸ“Š Core Analysis & Exam Strategy

This module is a major source of numerical and analytical questions, typically accounting for a significant portion of the exam. The focus heavily leans toward table construction, iterative algorithms, and structured graph theory.

High-Yield Core Topics:

  1. Intermediate Representations: You will almost certainly be asked to convert mathematical expressions (e.g., a = (b + c) * (b + c) + d) into structured formats like Quadruples, Triples, and Indirect Triples.
  2. Data Flow Analysis: This represents the largest numerical problems. Expect to partition intermediate code into Basic Blocks, draw a Control Flow Graph (CFG), and solve for Reaching Definitions using iterative algorithms.
  3. Loop Optimization Theory: Comparing techniques like Code Motion, Strength Reduction, and Induction Variable Elimination.
  4. Register Allocation: Explaining Graph Coloring heuristics and the difference between Local and Global allocation.

Time Management Strategy for Compiler Design Exams

When tackling this module during an exam, prioritize execution order based on mechanical effort vs. weightage:

  1. High-Weight Numerical (Data Flow / Reaching Defs): Allocate ~25 mins. These are mechanical. Once your table and iterations are set up, partial marks are heavily assured even if final convergence isn't flawless.
  2. Mid-Weight Numerical (DAG / Triples): Allocate ~12 mins. Quick to construct. Draw the DAG cleanly and list the table directly.
  3. Theory Answers: Allocate ~10 mins. Outline your answer in 2 minutes, write for 6. Always include code/assembly examplesβ€”theory without examples rarely scores full marks.

Definition: A post-pass optimization that scans a small sliding window (2-5 instructions) in assembly code, replacing inefficient sequences with optimized ones.

Key Categories with Examples:

  1. Redundant Load/Store:
1STR R0, [addr] 2LDR R0, [addr] // ← Delete this (R0 already holds the value)
  1. Algebraic Simplification:
1ADD R0, R0, #0 // ← Delete (no-op) 2MUL R0, R0, #1 // ← Delete (no-op)
  1. Unreachable Code:
1B L1 2MOV R0, #5 // ← Delete (never reached) 3L1: ...

Numerical Walkthrough: Multi-Loop Partitioning & CFG

  1. 1
    Step 1

    Analyze the following intermediate code to partition it into basic blocks and draw the Control Flow Graph.

    1(1) i = 12 2(2) j = 1 3(3) t1 = 10 * i 4(4) t2 = t1 + j 5(5) t3 = 8 * t2 6(6) t4 = t3 - 88 7(7) a[t4] = 0.0 8(8) j = j + 1 9(9) if j <= 10 goto (3) 10(10) i = i + 1 11(11) if i <= 10 goto (2) 12(12) i = 1 13(13) t5 = i - 1 14(14) t6 = 88 * t5 15(15) a[t6] = 1.0 16(16) i = i + 1 17(17) if i <= 10 goto (13)
  2. 2
    Step 2

    Using standard compiler rules to find leaders:

    1. Statement 1: First instruction.
    2. Statement 3: Target of conditional jump goto (3) (line 9).
    3. Statement 2: Target of conditional jump goto (2) (line 11).
    4. Statement 13: Target of conditional jump goto (13) (line 17).
    5. Statement 10: Instruction immediately following the jump at line 9.
    6. Statement 12: Instruction immediately following the jump at line 11.

    Final Set of Leaders: {1, 2, 3, 10, 12, 13}

  3. 3
    Step 3

    Group instructions starting from a leader down to (but not including) the next leader.

    B1: {(1)} B2: {(2)} B3: {(3), (4), (5), (6), (7), (8), (9)} B4: {(10), (11)} B5: {(12)} B6: {(13), (14), (15), (16), (17)}

  4. 4
    Step 4

    Connect blocks based on natural sequential flow and explicit jump targets.

Numerical Walkthrough: Quadruples vs Triples Translation

  1. 1
    Step 1

    Problem: Represent the statement a = (b + c) * (b + c) + d using Three-Address Code, Quadruples, and Triples.

  2. 2
    Step 2

    Decompose the expression into fundamental operations.

    1t1 = b + c 2t2 = b + c 3t3 = t1 * t2 4t4 = t3 + d 5a = t4

    (Note: An optimizing compiler would realize t2 = t1 and reuse it, but for explicit translation matrices, map the literal operations unless instructed otherwise.)

  3. 3
    Step 3

    Quadruples require four fields: operator, argument 1, argument 2, and explicit result variable.

    #oparg1arg2result
    (0)+bct1
    (1)+bct2
    (2)*t1t2t3
    (3)+t3dt4
    (4)=t4a
  4. 4
    Step 4

    Triples omit the explicit result field. Instead, they reference previous computational steps using their positional index (in parentheses).

    #oparg1arg2
    (0)+bc
    (1)+bc
    (2)*(0)(1)
    (3)+(2)d
    (4)=a(3)

Numerical Walkthrough: Directed Acyclic Graph (DAG) Construction

  1. 1
    Step 1

    Problem: Construct a DAG for the following code, identify subexpressions, and generate optimized TAC.

    1(1) t1 = 4 * i 2(2) t2 = a[t1] 3(3) t3 = 4 * j 4(4) t4 = b[t3] 5(5) t5 = t2 * t4 6(6) t6 = prod + t5 7(7) prod = t6 8(8) t7 = i + 1 9(9) i = t7 10(10) if i <= 20 goto (1)
  2. 2
    Step 2

    Establish base constants and initial variable states.

    NodeTypeValue
    L1const4
    L2vari (initial)
    L3vara
    L4varj (initial)
    L5varb
    L6varprod (initial)
    L7const1
    L8const20
  3. 3
    Step 3
    • (1) t1 = 4 * i β†’ Create interior node I1 = *(L1, L2), label 't1'.
    • (2) t2 = a[t1] β†’ Create interior node I2 = =[](L3, I1), label 't2'.
    • (3) t3 = 4 * j β†’ Reuse constant L1. Create I3 = *(L1, L4), label 't3'.
    • (4) t4 = b[t3] β†’ Create I4 = =[](L5, I3), label 't4'.
    • (5) t5 = t2 * t4 β†’ Create I5 = *(I2, I4), label 't5'.
  4. 4
    Step 4
    • (6) t6 = prod + t5 β†’ Create I6 = +(L6, I5), label 't6'.
    • (7) prod = t6 β†’ This is a copy! Merge label 'prod' onto node I6. I6 now holds {t6, prod}.
    • (8) t7 = i + 1 β†’ Create I7 = +(L2, L7), label 't7'.
    • (9) i = t7 β†’ Another copy! Merge label 'i' onto I7. I7 holds {t7, i}.
    • (10) if i <= 20 β†’ Create relational node I8 = <=(I7, L8).
  5. 5
    Step 5
  6. 6
    Step 6

    Traversing the DAG interior nodes mathematically eliminates the redundant copy statements:

    1(1) t1 = 4 * i 2(2) t2 = a[t1] 3(3) t3 = 4 * j 4(4) t4 = b[t3] 5(5) t5 = t2 * t4 6(6) prod = prod + t5 7(7) i = i + 1 8(8) if i <= 20 goto (1)

    Optimization Results: Instruction count reduced from 10 to 8. Temporary variables reduced from 7 to 5. Two explicit copy operations entirely eliminated.

Numerical Walkthrough: Reaching Definitions Data Flow

  1. 1
    Step 1

    Problem: Compute Reaching Definitions using the iterative algorithm for the block below. Determine GEN, KILL, and show full state convergence.

    1(1) i = 1 2(2) j = 1 3(3) t1 = 10 * i 4(4) t2 = t1 + j 5(5) a[t2] = 0 6(6) j = j + 1 7(7) if j <= 10 goto (3) 8(8) i = i + 1 9(9) if i <= 10 goto (2)
  2. 2
    Step 2

    Basic Blocks identified by leaders:

    • B1: {(1)}
    • B2: {(2)}
    • B3: {(3), (4), (5), (6), (7)}
    • B4: {(8), (9)}
  3. 3
    Step 3

    Map every assignment to a definition tracker.

    IDBlockVariableIDBlockVariable
    d1B1id5B3a[t2]
    d2B2jd6B3j
    d3B3t1d7B4i
    d4B3t2
  4. 4
    Step 4

    A definition KILLs all other definitions of the same variable elsewhere in the program.

    BlockGEN[B]KILL[B]Reasoning
    B1{d1}{d7}d1 redefines i, killing B4's d7
    B2{d2}{d6}d2 redefines j, killing B3's d6
    B3{d3, d4, d5, d6}{d2}d6 redefines j, killing B2's d2
    B4{d7}{d1}d7 redefines i, killing B1's d1
  5. 5
    Step 5

    All IN sets begin as βˆ…. Formula: OUT = GEN U (IN - KILL)

    BlockINGENIN - KILLOUT
    B1βˆ…{d1}βˆ…{d1}
    B2βˆ…{d2}βˆ…{d2}
    B3βˆ…{d3,d4,d5,d6}βˆ…{d3,d4,d5,d6}
    B4βˆ…{d7}βˆ…{d7}
  6. 6
    Step 6

    Calculate IN sets based on predecessors. Example: IN[B2] = OUT[B1] U OUT[B4]

    BlockININ - KILLOUT
    B1βˆ…βˆ…{d1}
    B2{d1, d7}{d1, d7}{d1, d2, d7}
    B3{d2, d3, d4, d5, d6}{d3, d4, d5, d6}{d3, d4, d5, d6}
    B4{d3, d4, d5, d6}{d3, d4, d5, d6}{d3, d4, d5, d6, d7}
  7. 7
    Step 7

    Continue applying the union rule until states cease altering.

    BlockININ - KILLOUT (Iter 3)OUT (Iter 4)
    B1βˆ…βˆ…{d1}{d1} (Stable)
    B2{d1, d3, d4, d5, d6, d7}{d1, d3, d4, d5, d7}{d1, d2, d3, d4, d5, d7}Stable
    B3{d1, d2, d3, d4, d5, d6, d7}{d1, d3, d4, d5, d6, d7}{d1, d3, d4, d5, d6, d7}Stable
    B4{d1, d3, d4, d5, d6, d7}{d3, d4, d5, d6, d7}{d3, d4, d5, d6, d7}Stable

    Fixpoint Reached: Because Iteration 4's OUT matches Iteration 3, convergence is achieved.

Conceptual Foundations & Objective Fact Bank

Knowledge Check

Question 1 of 6
Q1Single choice

Which of the following is evaluated as a 'leader' when partitioning a basic block?