Program Dependence Graph

Dear Community,

It’s really amazing to see the progress of MLIR! I have been following some topics of MLIR, but am not aware of all topics. I recently thought of a transformation using MLIR that requires the use of “Program Dependence Graph”. I wonder if the current version of MLIR supports this, or someone looking into it, or in the planning phase.

I noticed the presence of Affine/LinAlg/MemRef Dependence analysis, but the transformation that I plan to consider would be at a low-level abstraction, for, e.g., SCF/STD or even lower level. Any thoughts on how I can approach this?

Ref:

  1. Dependence Graphs in LLVM — LLVM 13 documentation
  2. https://www.cs.utexas.edu/~pingali/CS395T/2009fa/papers/ferrante87.pdf

The “Program dependence graph” from the late eighties/early nineties proposed yet another structure that combined control flow graphs and data dependences — that was all before SSA-based compilers and more powerful dependence notations became mainstream. Have you instead evaluated how the transformations you have in mind could be implemented using MLIR’s existing abstractions and infrastructure (or a subset of them), i.e., SSA, dominance info, branch op interface, other higher-order ops (for, ifs) or affine dependence information? In what way would the existing abstractions make it harder for your transformation?

Glad to hear from you Uday!

I’m exploring different dependence graph abstractions to answer fundamentally this question – For an operation that is writing into memory via memref, who is the “immediate/next” operation that is going to read from that memref (an example is in the bottom)? These operations are in a lower-level abstraction compared to Affine and hence not able to leverage MemRefDependenceGraph used for LoopFusion. (Also, MemRefDependenceGraph is yet to perform the loop-depth wise dependence analysis).

However, I thought of constructing all the dependencies (including transitive) by considering for every pair of load/stores and filter the transitive ones from the entire set. But, I’m realizing that this is becoming tricky especially in presence of loop nests and also not efficient w.r.t to compilation time overhead.

The other possibility is to walk through the block of the given operation, then traverse upwards to see if there are any blocks/for/if nodes and keep tracking of the closest ones – I felt this is very ad-hoc and complex. I also slowly started to realize that Program Dependence Graphs may not be straightforward to answer this question – Maybe multi-dimensional timestamps from the polyhedral model?

Example:
S1; // Store
for(i-loop) {
S2 //Load
for(j-loop) {
S3: //Load
S4: //Store
}
S5: //Load
}

Who is the immediate/next statement that is going to read memref after being written by S4? The answer is either S3 or S5, but not S2.

Thanks,
Prasanth

Not that it solves your problem, but the -affine-scalrep pass has to deal with similar problems to perform its optimizations (forwarding a store value to the loads that read from it and also eliminate redundant loads). The pass uses a combination of dominance info and affine memref dependence information. The code comments therein may be useful.

Memref dependence info can already be computed based on loop depths. I think you are only looking at the thing being used by the fusion pass. However, see for example test/lib/Analysis/TestMemRefDependenceCheck.cpp:

 srcOpInst->emitRemark("dependence from ")
            << i << " to " << j << " at depth " << d << " = "
            << getDirectionVectorStr(ret, numCommonLoops, d,
                                     dependenceComponents);
      }

or the -affine-scalrep pass where a dependence is checked at specific/desired depths between a store/load pair.

Thanks, Uday for pointing me to the -affine-scalrep pass. Using the knowledge from this pass, I’m currently exploring an alternative analysis that can identify the immediate/first read for a given store (both SSA value and memref), even if the reads/writes are not part of the dominators. I feel that this analysis can enable more opportunities for store propagation and other optimizations too.

Thanks for correcting me on the dependence depths – I was looking at fusion pass.