This is a heads-up about some important changes we (Sony) are making to the Instruction-creation APIs; the work for this was done by @jmorse, and the PSA below was written by him, but he’s on holiday for a while so I’ll be handling the attempt to land the work:
Hi all,
FYI, we’re planning on removing a whole host of Instruction constructors and Create methods that accept Instruction pointers, in favour of taking an iterator. This is likely to break a lot of downstream code in a way that seems spurious, so I thought I’d explain why it’s worth the disruption.
Briefly: all subclasses of Instruction have two flavours of constructors and Create methods right now, such as:
static BinaryOperator *CreateNot(Value *Op, const Twine &Name = "", Instruction *InsertBefore = nullptr);
static BinaryOperator *CreateNot(Value *Op, const Twine &Name, BasicBlock *InsertAtEnd);
These take a block to insert into, or an instruction. They all have a defaultable-to-nullptr flavour that doesn’t insert anywhere too. We’ve recently added versions of each that take a BasicBlock::iterator instead of an Instruction pointer and converted almost all of LLVM to use the iterator-flavour if needed. The plan is to soon (weeks) delete the Instruction pointer taking flavour (and let the block-taking version default to nullptr):
static BinaryOperator *CreateNot(Value *Op, const Twine &Name = "", BasicBlock *InsertAtEnd = nullptr);
static BinaryOperator *CreateNot(Value *Op, const Twine &Name, BasicBlock::iterator InsertBefore);
We’re doing this so that the correctness of debug-info updating is in the type system. As discussed in my proposal last year [0], we can eliminate debug intrinsics and maintain debugging information without getting in the way of optimisation passes, so long as we can identify whether passes intend to insert something at the beginning of a block. That’s achieved by putting a single bit in BasicBlock::iterator to signal if it was created by getFirstInsertionPt / getFirstNonPHI, and ensuring passes use those iterators to insert instructions. Once this is done you’ll never have to see a dbg.value intrinsic again, and the type system will encourage the correct identification of the start of blocks.
We’ve already enabled intrinsic-less debug-info mode and observed compile-time speedups (it’s usually proportionate to the amount of inlining) and a large reduction in debug-info having an effect on codegen decisions in our downstream tests. These constructor-changes are the final step needed to guarantee correctness in the future (there will be subtle regressions and off-by-ones otherwise). Thank you for your understanding!
~
The patch taking away the relevant constructors can be found here [1] if you want to make the necessary changes ahead-of-time. 95% of the insertion sites require trivial changes that are basically just changing a spelling. When making changes to a call-site that inserts an instruction using an instruction pointer:
- If the insertion-position identified because of what the position does, i.e. it computes a particular value, it’s a particular kind of instruction, it’s a user of a particular value… then just call getIterator() on the insert-site instruction pointer.
- If the insertion-position is identified because of /where/ it is in the block, I would encourage using getIterator() and std::next / std::prev to find that position, but you could use getNextNode / getPrevNode and getIterator too without worry,
- For any situation where the position might have been sourced from BasicBlock::begin, or getFirstNonPHI, or getFirstInsertionPt, you need to store that iterator and use it for the insertion position.
[0] [RFC] Instruction API changes needed to eliminate debug intrinsics from IR
[1] [RemoveDIs][NFC] Remove Instruction constructors that insert prior to Instructions by SLTozer · Pull Request #85980 · llvm/llvm-project · GitHub