A directed acyclic graph (DAG!) is a directed graph that contains no cycles. A rooted tree is a special kind of DAG and a DAG is a special kind of directed graph. For example, a DAG may be used to represent common subexpressions in an optimizing compiler.
+ + . . . . . . . . * () *<---| () . . . . . . | . . . . . . . . | . | a b f * a b | f | . . ^ v . . | | a b |--<---- Tree DAG
expression: a*b+f(a*b)
Example of Common Subexpression.
The common subexpression a*b need only be compiled once but its value can be used twice.
A DAG can be used to represent prerequisites in a university course, constraints on operations to be carried out in building construction, in fact an arbitrary partial-order `<'. An edge is drawn from a to b whenever a<b. A partial order `<' satisfies:
(i) transitivity, a<b and b<c implies a<c
(ii) non-reflexive, not(a < a)
Standard graph terminology is used throughout this section. In addition,
for a directed edge e we denote by src(e) and dst(e) the source vertex and
the destination vertex of e.
‘Optimizations’ of Basic Blocks
Equivalent transformations: Two basic block are equivalent if they compute the same set of expressions.
-Expressions: are the values of the live variables at the exit of the block.
Two important classes of local transformations:
1:structure preserving transformations:
*common sub expression elimination
*dead code elimination
*renaming of temporary variables
*interchange of two independent adjacent statements.
2:algebraic transformations (countlessly many):
*simplify expressions
*replace expensive operations with cheaper ones.
The DAG Representation of Basic Blocks
Directed acyclic graphs (DAGs) give a picture of how the value computed by each statement in the basic block is used in the subsequent statements of the block.
Definition: a dag for a basic block is a directed acyclic graph with the following labels on nodes:
leaves are labeled with either variable names or constants.
they are unique identifiers
from operators we determine whether l- or r-value.
represent initial values of names. Subscript with 0.
interior nodes are labeled by an operator symbol.
Nodes are also (optionally) given a sequence of identifiers for labels.
- interior node = computed values
- identifiers in the sequence – have that value.
Algorithm DAG: Constructing a DAG
Input: A basic block of three address statements. No pointers or array
references.
Output:A DAG where each node n has a value, VALUE(n), which is an
operator in the case of an interior node or a variable name if the node is a
leaf. Also, each node n has a (possibly empty) list of identifiers attached,
ID(n).
Method:
begin
for each instruction I in the basic block do
if I is of form (x := y op z) then
Find a node, ny, such that y ε ID(ny) (only
one can exist). If it cannot be found, create a
leaf with VALUE(ny)=ID(ny)={y}.
... same for z, node is nz ...
Find or, if not found, create node m such that
VALUE(m) = op and ny and nz are
resp. its left and rignt child.
if there is p such that x ε ID(p) then
ID(p) := ID(p) - {x}
fi
ID(m) := ID(m) + {x}
elseif I is of form ( x := op y) then
... similar code as above...
elseif I is of form (x := y) then
Find a node, ny, such that y ε ID(ny) (only
one can exist). If it cannot be found, create a
leaf with VALUE(ny)=ID(ny)={y}.
m := ny
if there is p such that x ε ID(p) then
ID(p) := ID(p) - {x}
fi
ID(m) := ID(m) + {x}
fi
od
end
With the DAG it is easy to determine which identifiers have their values
used in the block. These are the identifiers for which a leaf is created.
Also, it is easy to determine the statements that compute values that
could be used outside the block. These are the statements whose
associated node, m, still has its left hand side, x, in ID(m) at the end of
the algorithm.
To improve the chances of finding common subexpressions,
commutative operations should be normalized. For example,when both
operands are variables, alphabetical order could be used. Also, if one of
the operands is a leaf and the other an internal node, the leaf could be
placed on the left.
Constant folding can be applied by replacing the nodes that evaluate to a
constant, say c, with a node m such that VALUE(m) is c.
The previous algorithm was introduced by Cocke and Schwartz and is
known as “Value Numbering of Basic Blocks”.
When there are references to array elements and to pointers, we need to
make sure that:
1. Common subexpressions are properly identified
2. The order of instructions generated from the DAG is correct.
To make sure that common subexpressions are correctly identified an
extra bit is added to each node. Every time there is an assignment to an
element of an array, all nodes representing elements of that array are
killed by setting the bit. The ID of a killed node cannot be extended with
new variable names. That is, they cannot be recognized as common
subexpressions.
Also, when there is an assignment of the form *p :=a all nodes in the
DAG must be killed if we do not know what p might point to.
Similar observations could be made about formal parameters when the
language allows aliasing.
To guarantee correct order of generated instructions, the DAG could be
extended with arcs that enforce the following rules:
1. Any two references to an array one of which is a write, must be
performed in the original order.
2. Any two references, if one is a write and at least one of the references
is through a pointer must be performed in the original order
No comments:
Post a Comment