Hi, I am seeing a problem when running llc on a piece of auto-generated code: the machine block placement pass is taking an excessive amount of time.
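For one file with a rather large function (a threaded interpreter core with indirect br instructions) and a lot of small helper functions (roughly 3-4k of them), I get the following timings for that pass (in seconds; the columns are user time, system time, user+system, and wall time):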
llc -O1
1616.9128 ( 42.6%) 0.4521 ( 22.5%) 1617.3649 ( 42.6%) 1617.7932 ( 42.6%) Branch Probability Basic Block Placement
llc -O2
1626.9825 ( 42.1%) 0.3805 ( 13.4%) 1627.3631 ( 42.0%) 1627.5726 ( 42.0%) Branch Probability Basic Block Placement
llc -O3
3322.8112 ( 60.2%) 0.8212 ( 35.7%) 3323.6324 ( 60.2%) 3324.1152 ( 60.2%) Branch Probability Basic Block Placement
That is almost an hour spent in this one pass when compiling with -O3. I also have a larger interpreter with more helper functions, and that was clocking in at 10h of compile time, but I am not going to rerun that test right now.
So I am wondering whether there is a description somewhere of how the pass works, or whether someone can explain it briefly, and what approaches there are to controlling the time it takes. This would help me fix our code generator.