Which passes are executed at each mlir-cpu-runner opt level?

Hi,

I want to know which passes are executed at each mlir-cpu-runner opt level (--O0, --O1, --O2, --O3). I used mlir-cpu-runner --O3 --debug-pass=Arguments ... to print the passes:

Pass Arguments:  -tti -tbaa -scoped-noalias-aa -assumption-cache-tracker -targetlibinfo -coro-early -simplifycfg -domtree -sroa -early-cse -lower-expect
Pass Arguments:  -tti -tbaa -scoped-noalias-aa -targetlibinfo -assumption-cache-tracker -profile-summary-info -annotation2metadata -forceattrs -inferattrs -domtree -callsite-splitting -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -inline -openmpopt -function-attrs -argpromotion -coro-split -domtree -sroa -basic-aa -aa -memoryssa -early-cse-memssa -speculative-execution -aa -lazy-value-info -jump-threading -correlated-propagation -simplifycfg -domtree -aggressive-instcombine -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -libcalls-shrinkwrap -loops -postdomtree -branch-prob -block-freq -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -pgo-memop-opt -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -memoryssa -lazy-branch-prob -lazy-block-freq -licm -loop-unswitch -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-idiom -indvars -loop-deletion -loop-unroll -sroa -aa -mldst-motion -phi-values -aa -memdep -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -gvn -sccp -demanded-bits -bdce -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -lazy-value-info -jump-threading -correlated-propagation -postdomtree -adce -basic-aa -aa -memoryssa -memcpyopt -dse -loops -loop-simplify -lcssa-verification -lcssa -aa -scalar-evolution -lazy-branch-prob -lazy-block-freq -licm -coro-elide -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -barrier -elim-avail-extern -basiccg 
-rpo-function-attrs -globalopt -globaldce -basiccg -globals-aa -domtree -float2int -lower-constant-intrinsics -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -postdomtree -branch-prob -block-freq -scalar-evolution -basic-aa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -lazy-branch-prob -lazy-block-freq -loop-load-elim -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -domtree -loops -scalar-evolution -basic-aa -aa -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -slp-vectorizer -vector-combine -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -memoryssa -loop-simplify -lcssa-verification -lcssa -scalar-evolution -lazy-branch-prob -lazy-block-freq -licm -opt-remark-emitter -transform-warning -alignment-from-assumptions -strip-dead-prototypes -globaldce -constmerge -cg-profile -domtree -loops -postdomtree -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -div-rem-pairs -simplifycfg -coro-cleanup -annotation-remarks
Pass Arguments:  -domtree
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Pass Arguments:  -targetlibinfo -domtree -loops -lazy-branch-prob -lazy-block-freq
Pass Arguments:  -targetpassconfig -machinemoduleinfo -tti -tbaa -scoped-noalias-aa -assumption-cache-tracker -targetlibinfo -profile-summary-info -collector-metadata -machine-branch-prob -loweremutls -pre-isel-intrinsic-lowering -atomic-expand -lower-amx-intrinsics -lower-amx-type -domtree -basic-aa -loops -loop-simplify -scalar-evolution -canon-freeze -iv-users -loop-reduce -basic-aa -aa -mergeicmps -loops -lazy-branch-prob -lazy-block-freq -expandmemcmp -gc-lowering -shadow-stack-gc-lowering -lower-constant-intrinsics -unreachableblockelim -loops -postdomtree -branch-prob -block-freq -consthoist -replace-with-veclib -partially-inline-libcalls -scalarize-masked-mem-intrin -expand-reductions -interleaved-access -x86-partial-reduction -indirectbr-expand -loops -codegenprepare -rewrite-symbols -domtree -dwarfehprepare -safe-stack -stack-protector -basic-aa -aa -loops -postdomtree -branch-prob -lazy-branch-prob -lazy-block-freq -machinedomtree -finalize-isel -x86-domain-reassignment -lazy-machine-block-freq -early-tailduplication -opt-phis -slotindexes -stack-coloring -localstackalloc -dead-mi-elimination -machinedomtree -machine-loops -machine-trace-metrics -early-ifcvt -lazy-machine-block-freq -machine-combiner -x86-cmov-conversion -machinedomtree -machine-loops -machine-block-freq -early-machinelicm -machinedomtree -machine-block-freq -machine-cse -machinepostdomtree -machine-sink -peephole-opt -dead-mi-elimination -lrshrink -x86-fixup-setcc -lazy-machine-block-freq -x86-optimize-LEAs -x86-cf-opt -x86-avoid-SFB -x86-slh -machinedomtree -x86-flags-copy-lowering -machinedomtree -detect-dead-lanes -processimpdefs -unreachable-mbb-elimination -livevars -machine-loops -phi-node-elimination -twoaddressinstruction -slotindexes -liveintervals -simple-register-coalescing -rename-independent-subregs -machine-scheduler -machine-block-freq -livedebugvars -livestacks -virtregmap -liveregmatrix -edge-bundles -spill-code-placement -lazy-machine-block-freq 
-machine-opt-remark-emitter -greedy -tileconfig -virtregrewriter -stack-slot-coloring -machine-cp -machinelicm -lowertilecopy -edge-bundles -x86-codegen -machinedomtree -machine-domfrontier -x86-lvi-load -fixup-statepoint-caller-saved -postra-machine-sink -machine-block-freq -machinepostdomtree -lazy-machine-block-freq -machine-opt-remark-emitter -shrink-wrap -prologepilog -branch-folder -lazy-machine-block-freq -tailduplication -machine-cp -postrapseudos -x86-pseudo -machinedomtree -machine-loops -post-RA-sched -gc-analysis -machine-block-freq -machinepostdomtree -block-placement -fentry-insert -xray-instrumentation -patchable-function -reaching-deps-analysis -x86-execution-domain-fix -break-false-deps -machinedomtree -machine-loops -lazy-machine-block-freq -x86-fixup-bw-insts -lazy-machine-block-freq -x86-fixup-LEAs -x86-evex-to-vex-compress -funclet-layout -stackmap-liveness -livedebugvalues -x86-seses -cfi-instr-inserter -x86-lvi-ret -lazy-machine-block-freq -machine-opt-remark-emitter
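To compare dumps like the one above across opt levels, each `Pass Arguments:` line can be split into a list of pass names. The following is only a minimal sketch: the function names `parse_pass_arguments` and `passes_only_in` are hypothetical, and the two sample strings are made-up, shortened inputs that exercise the parser, not real mlir-cpu-runner output.

```python
def parse_pass_arguments(text):
    """Split --debug-pass=Arguments output into one list of pass names per dump."""
    prefix = "Pass Arguments:"
    pipelines = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Each "Pass Arguments:" line starts a new dump.
            pipelines.append(line[len(prefix):].split())
        elif line.startswith("-") and pipelines:
            # A wrapped line continues the previous dump.
            pipelines[-1].extend(line.split())
    # Strip only the leading '-' so names such as "early-cse" stay intact.
    return [[flag.lstrip("-") for flag in group] for group in pipelines]


def passes_only_in(first, second):
    """Pass names found in `first` but not in `second`, in first-seen order."""
    seen_in_second = {name for group in second for name in group}
    result = []
    for group in first:
        for name in group:
            if name not in seen_in_second and name not in result:
                result.append(name)
    return result


# Made-up, shortened inputs for illustration only.
sample_a = """Pass Arguments:  -tti -tbaa -domtree -sroa
-early-cse -lower-expect
Pass Arguments:  -domtree
"""
sample_b = "Pass Arguments:  -tti -domtree\n"

dump_a = parse_pass_arguments(sample_a)
dump_b = parse_pass_arguments(sample_b)
print(dump_a)
print(passes_only_in(dump_a, dump_b))
```

Capturing the real output once per opt level and feeding each capture through `parse_pass_arguments` would give the per-level differences.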

But when I try to pass these arguments to mlir-cpu-runner, some of them are rejected:

mlir-cpu-runner: Unknown command line argument '--pgo-memop-opt'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--memcpyopt'?
mlir-cpu-runner: Unknown command line argument '--cg-profile'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--sample-profile'?
mlir-cpu-runner: Unknown command line argument '--machinemoduleinfo'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--machinedomtree'?
mlir-cpu-runner: Unknown command line argument '--collector-metadata'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--annotation2metadata'?
mlir-cpu-runner: Unknown command line argument '--loweremutls'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--lowerswitch'?
mlir-cpu-runner: Unknown command line argument '--pre-isel-intrinsic-lowering'.  Try: '../../../llvm/build/bin/mlir-cpu-runner --help'
mlir-cpu-runner: Did you mean '--x86-flags-copy-lowering'?
... ...

I don’t know whether --debug-pass=Arguments is the right way to print the passes for each opt level. If it is not, how can I find out which passes are executed at each mlir-cpu-runner opt level?

Thanks!

Hongbin

All of these are LLVM passes; you shouldn’t (and most of the time can’t) run them manually with mlir-cpu-runner or any other MLIR tool. If you want to run these passes individually, convert the MLIR to LLVM IR and run LLVM’s opt tool on the result.
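The two-step route (translate to LLVM IR, then run opt) can be sketched as a pair of command lines. This is a rough sketch under assumptions: the input file is assumed to be already lowered to the LLVM dialect, the file names and the helper `build_commands` are hypothetical, and the commands are only printed here, not executed.

```python
import os
import shlex


def build_commands(mlir_file, opt_level="-O3"):
    """Return the mlir-translate and opt command lines as argument lists."""
    base, _ = os.path.splitext(mlir_file)
    ll_file = base + ".ll"          # textual LLVM IR produced by mlir-translate
    opt_file = base + ".opt.ll"     # textual LLVM IR after optimization
    translate = ["mlir-translate", "--mlir-to-llvmir", mlir_file, "-o", ll_file]
    optimize = ["opt", opt_level, "-S", ll_file, "-o", opt_file]
    return translate, optimize


# Print the commands instead of running them; subprocess.run(cmd, check=True)
# could execute them where the LLVM tools are on PATH.
for cmd in build_commands("example.mlir"):
    print(shlex.join(cmd))
```

To run individual passes rather than a whole opt level, the `opt_level` entry would be replaced by whatever pass flags your build of opt accepts; their spelling depends on the LLVM version.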

Here’s the code that populates LLVM’s pass manager: llvm-project/OptUtils.cpp at main · llvm/llvm-project · GitHub. You can dig in from there to see which passes are added and under which conditions.

Thanks a lot!

I have a follow-up question about the LLVM optimizing passes in mlir-cpu-runner. According to mlir-cpu-runner --help, there is a list of LLVM passes:

LLVM optimizing passes to run
      --aa                                              - Function Alias Analysis Results
      --aa-eval                                         - Exhaustive Alias Analysis Precision Evaluator
      --adce                                            - Aggressive Dead Code Elimination
      --add-discriminators                              - Add DWARF path discriminators
      ... ...

I think that if I use these LLVM arguments with mlir-cpu-runner, the passes actually run at the LLVM IR level, so the final effect is the same whether I use them with mlir-cpu-runner or with opt, right? I am also wondering why these LLVM passes can be used with mlir-cpu-runner while others (the rejected arguments I showed above) are not available in mlir-cpu-runner.

The help message for mlir-cpu-runner is likely “lying” here: LLVM tends to register all of its options globally, so they show up even in tools where they aren’t honored. We should look into this, though.

After digging into mlir-cpu-runner a little, I found that mlir::initializeLLVMPasses() (llvm-project/OptUtils.cpp at main · llvm/llvm-project · GitHub) initializes only some of the LLVM passes (llvm-project/InitializePasses.h at main · llvm/llvm-project · GitHub). So I think those arguments are missing because mlir-cpu-runner doesn’t register all of the LLVM passes.
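To see exactly which printed passes the tool rejects, the "Unknown command line argument" messages can be collected into a list. This is a small sketch, not part of any LLVM tooling: the helper `rejected_flags` is hypothetical, and the sample text is abbreviated from the error messages quoted earlier in the thread.

```python
import re

# Matches the flag name inside: Unknown command line argument '--name'
UNKNOWN_ARG = re.compile(r"Unknown command line argument '--?([^']+)'")


def rejected_flags(stderr_text):
    """Return the flag names reported as unknown, in order, without duplicates."""
    names = []
    for match in UNKNOWN_ARG.finditer(stderr_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names


# Abbreviated copy of two of the messages shown above; the path is elided.
errors = """mlir-cpu-runner: Unknown command line argument '--pgo-memop-opt'.  Try: ...
mlir-cpu-runner: Did you mean '--memcpyopt'?
mlir-cpu-runner: Unknown command line argument '--cg-profile'.  Try: ...
"""
print(rejected_flags(errors))
```

The "Did you mean" suggestions are ignored on purpose, since they name flags the tool does accept.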

Just a question: is running mlir-cpu-runner -O3 equivalent to using llc -O3 in the manual code generation process? In other words, is the -O3 simply passed on to the llc command, or is there more to it (e.g. some mlir-opt level transformations that are enabled)?

Judging from the populatePassManagers function that Alex pointed to above, I don’t think any mlir-opt level passes are executed by -O3. But I am not sure whether it is equivalent to llc -O3.