Diving into the third item in this list of things that can be done to improve compiler performance, I’d like to discuss improving the efficiency of the getOperand
calls.
For background, Operation
is currently defined like this:
class alignas(8) Operation final
: public llvm::ilist_node_with_parent<Operation, Block>,
private llvm::TrailingObjects<Operation, BlockOperand, Region,
detail::OperandStorage>
The operands of an operation are stored as a consecutive array of OpOperand
values, which are allocated as part of the aptly named OperandStorage
object. The consequence of this is that very common calls like op->getOperand(0)
or op.lhs()
has to load the # block operands, # regions, and do a bunch of math to get the OpOperand
and the Value’s in question. It may also surprise you that things like: Value lhs = thing.lhs(), rhs = thing.rhs();
does all this arithmetic twice: it can’t be CSEd because it is large and out-of-line.
It is inherent to the design of Operation that some thing will need this arithmetic, but BlockOperand and Regions are much less common and commonly accessed than operands, so I’d like to see operands come first.
The challenge with this is that detail::OperandStorage
itself has trailing objects:
/// This class handles the management of operation operands. Operands are
/// stored either in a trailing array, or a dynamically resizable vector.
class OperandStorage final
: private llvm::TrailingObjects<OperandStorage, OpOperand> {
...
/// The current representation of the storage. This is either a
/// InlineOperandStorage, or a pointer to a InlineOperandStorage.
union {
TrailingOperandStorage inlineStorage;
llvm::PointerIntPair<TrailingOperandStorage *, 1, bool,
StoragePointerLikeTypeTraits>
dynamicStorage;
};
I think that can be addressed by refactoring this to flatten the operand list into Operation
itself, something akin to:
class alignas(8) Operation final
: public llvm::ilist_node_with_parent<Operation, Block>,
private llvm::TrailingObjects<Operation, OpOperand, BlockOperand, Region>
And then give Operation
a pointer to OpOperand for the out-of-line case, and have separate “number of inline operands” vs “number of operands” integers. Does this seem like a sensible approach to this?
-Chris