Optimisation isn’t actually optimisation; it doesn’t result in optimal code.
So what does it do?
- Preserve the meaning of the program
- On average improves the program (run time, space, program size)
- Is worth the cost of extended compile time + complex implementation.
Why compilers and not programmers?
- Programs should obviously be easy to understand and read for the programmer or another programmer. Optimisation is bound to reduce readability.
- The program might be written in a high level language that doesn’t actually make the low level access available necessary for optimisation (e.g. addressing).
- Some optimisations use target machine specific knowledge; imagine if the programmer had to write a different program for every target machine!
Some optimisations are required in order to do other optimisations (uhuh…). Some of this is quite obvious.
Constant Propagation: If the value of a temporary is known to be constant at an instruction, then replace the temporary with the constant:
t <- 42 == t <- 42 x <- x + t x <- x + 42
Copy Propagation: Similar to the above. But bypassing a copy
t <- y == t <- y x <- x + t x <- x + y
Loop Unrolling: Put two or more copies of a loop body in a row. This seems a bit odd, but it’ll probably make sense later:
L1: x <- x + 1 L1: if x >= 10 goto L2 x <- x + 1 == x <- x + 1 if x < 10 goto L1 if x < 10 goto L1 L2:
Eliminate Unneessary Computation
Again, this is just fancy names for intuitive optimisation. Some of these a smart programmer would do anyway.
Constant folding: If the operands of an instruction are constant, do the computation at compile time.
t1 <- 4 + 200 == t1 <- 204
Forward Substitution: Something like this:
t1 <- fp - 20 == t1 <- fp - 16 t1 <- t1 + 4
Apply Arithmetic Identities: Get rid of stupid things like:
t1 <- t2 * 1 == t1 <- t2
t1 <- t2 || True == t1 <- True
Partial Evaluation/Program Specialisation: This is a trickier one. Basically at compile time partially evaluate the possibilities of a function. For example if you have a function that takes two numbers, and constructs an array based on one of those numbers (then does other stuff) you can partially evaluate that into instances of the function with the array constructed already.
Dead Code Elimination: Eliminate code which can never be reached, eliminate instructions which compute values that are never used. Reminds me of dead code analysis; a bug-checking method that involves marking every line that is called in a program, then looking at the result and trying to get the program to run the fringe cases that cause the other lines to run. This helps find hidden bugs that aren’t always immediately present in normal operation because the code doesn’t normally get run.
Eliminate Array Bound Checks: Often a compiler can do this.
Eliminate Induction Variable: This is a bit complicated. Often in a for loop you have a variable that is incremented or decremented, but then some other variable that is set in relation to that variable. So why not just set that in the for loop’s parameters?
Common Subexpression Elimination: If a subexpression is run more than once, just run it once and save the value. This is a prime example of an optimisation a programmer might perform too.
- Hoisting Loop-Invariant Computations: a form of Code Motion optimisation. Move instructions that are performed in a loop repeatedly but don’t actually depend on the loop, to outside of the loop (technically known as ‘loop-invariant values’). It looks like this can get quite complicated according to the slides (slides 21+ in the Optimisation slides).
- Lazy Code Motion: Another form of Code Motion optimisation. Move all instructions as far up in the control flow graph as possible, exposing the maximum number of redundancies (somehow…). Then instructions are pushed as far down as possible, which minimises register lifetimes (uhuh). No idea what this means…
- Reduction in Strength: Replacing an expensive instruction by an equivalent cheaper one. For example
x * 4is the same as
x << 2which is cheaper.
- Procedure inlining: Replace the procedure call with the code of the procedure body. Saves having to call the procedure (depends on how costly that is I guess).
- Instruction Scheduling: Pipelining, and reordering instructions to facilitate pipelining where possible.
- Cache-aware data access: Align and order instructions and data for efficient mapping into cache, and take advantage of the cache when performing data structure traversals (don’t keep accessing memory). For example if you have a multidimensional array stored in rows, change any code that traverses by column to take advantage of memory fetches fetching the whole row at once.