LLVM Discussion Forums

Lowering optional attributes in Linalg StructuredOps to Standard dialect

Hi,

I’m working on interfacing Linalg StructuredOps with our accelerated libraries on AMD platforms. The infrastructure provided by StructuredOps makes the effort relatively easy.

I did find one caveat though: in linalg.conv, OptionalAttr values such as dilations / strides aren’t preserved when converting to the Standard dialect.

In linalg.copy, inputPermutation / outputPermutation aren’t preserved either.

I can sense some sort of ABI might be needed and would like to use this thread for discussion.

Jack Chung

Hello Jack,

Thank you for opening this discussion. It is exciting to see the community picking up some of the loose ends we didn’t have the resources to address on the spot.
As context, having strong support for interleaving codegen and library calls is one of the key design principles of the StructuredOps abstractions (of which Linalg is an implementation).

As you note, there is currently no support for passing attribute values to the library call itself. As the first official user of this feature you get to help drive it to ensure that it meets your particular needs.

Here are some deeper considerations re ABI when interfacing codegen with external pre-compiled functions.

Attribute Selection

Not all attributes should or will be passed: passes and transformations may introduce arbitrary attributes, and we should not pick them all up. Selection is on an op-by-op basis and will probably require “named” LinalgOps to declare a function that returns the known attribute names that need to be exported, in a fixed order so that they can be matched properly on the C++ side.
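
As a rough sketch of that selection step, the snippet below models an op's attributes as a plain map and picks out only the exported ones in a fixed order. Everything here is hypothetical and for illustration only: `AttrDict`, `convExportedAttrNames`, and `selectExportedAttrs` are made-up names, not existing MLIR or Linalg APIs, and the choice of "dilations" before "strides" is an arbitrary assumption.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an op's attribute dictionary: name -> values.
using AttrDict = std::map<std::string, std::vector<int64_t>>;

// Hypothetical per-op hook: the attribute names a conv op exports to its
// library call, in the exact order the C++ shim expects them.
inline std::vector<std::string> convExportedAttrNames() {
  return {"dilations", "strides"};
}

// Collects only the exported attributes, in the declared order. Attributes
// that passes added for their own use are ignored. A missing optional
// attribute yields an empty vector so that positions stay stable.
inline std::vector<std::vector<int64_t>>
selectExportedAttrs(const AttrDict &attrs,
                    const std::vector<std::string> &exported) {
  std::vector<std::vector<int64_t>> result;
  result.reserve(exported.size());
  for (const std::string &name : exported) {
    auto it = attrs.find(name);
    result.push_back(it == attrs.end() ? std::vector<int64_t>{} : it->second);
  }
  return result;
}
```

With this shape, the shim can read attributes purely by position, and any change to the exported list is exactly the kind of ABI change that also needs a matching change on the C++ side.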

This will require multiple rounds of iteration to get right because of the “static” nature of the problem: each ABI change will require a change on the C++ side for the shim function. This may be possible to automate but I have not yet thought about it and do not have a proposal. Solving the problem generally may require drilling into dynamic vs static considerations similar to interfacing Python with C++, which I think is premature. Of course, if you have a general solution in mind, please do propose it.

Ranked vs Unranked Representation

A consequence of the above is that each combination of (op, operand ranks) may have its own type, and we will need to decide between a ranked and an unranked representation for array attributes, i.e.
int dilations[2];
or

int rank;
int *dilations;

The ranked representation shifts the burden to the C++ interop (more shims to write and more structs to get right); the unranked representation shifts the burden to the compiler.
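
To make the tradeoff concrete, here is a minimal sketch of the two representations as C++ types. The names `RankedDilations`, `UnrankedDilations`, `toUnranked`, and `maxDilation` are invented for this illustration and do not correspond to anything in MLIR.

```cpp
#include <cassert>

// Ranked form: the rank is part of the type, so each rank needs its own
// struct instantiation and, in practice, its own shim.
template <int Rank>
struct RankedDilations {
  int values[Rank];
};

// Unranked form: one type for every rank. The compiler has to materialize
// the rank and a pointer to the values at the call site.
struct UnrankedDilations {
  int rank;
  const int *values;
};

// A ranked value can always be viewed as an unranked one.
template <int Rank>
UnrankedDilations toUnranked(const RankedDilations<Rank> &d) {
  return UnrankedDilations{Rank, d.values};
}

// A rank-agnostic consumer: written once, usable for any rank.
inline int maxDilation(const UnrankedDilations &d) {
  assert(d.rank > 0 && d.values != nullptr);
  int best = d.values[0];
  for (int i = 1; i < d.rank; ++i)
    if (d.values[i] > best)
      best = d.values[i];
  return best;
}
```

A consumer written against `RankedDilations<Rank>` would need one instantiation per rank, whereas `maxDilation` is compiled once and relies on the caller to supply a correct rank.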

This is also related to whether you use ranked or unranked descriptors for the buffers (memref): both the buffer descriptors and the attribute descriptors must be unranked if you want a single C++ entry point that covers, say, conv_1d, …, conv_nd.
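
A sketch of what such a single entry point could look like is below. The descriptor structs are deliberately simplified and hypothetical: they are NOT MLIR's actual memref descriptor layout, and the buffers are assumed to carry only spatial dimensions with no padding. The function only computes output extents, which is enough to show that nothing depends on a compile-time rank.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical rank-erased buffer descriptor (not MLIR's memref layout).
struct UnrankedBuffer {
  float *data;
  int64_t rank;
  const int64_t *sizes;
};

// Hypothetical rank-erased attribute descriptor.
struct UnrankedAttr {
  int64_t rank;
  const int64_t *values;
};

// One entry point shared by conv_1d ... conv_nd: computes the output extent
// of every spatial dimension from the input size, kernel size, stride, and
// dilation of that dimension.
std::vector<int64_t> convOutputSizes(const UnrankedBuffer &input,
                                     const UnrankedBuffer &kernel,
                                     const UnrankedAttr &strides,
                                     const UnrankedAttr &dilations) {
  assert(input.rank == kernel.rank);
  assert(strides.rank == input.rank && dilations.rank == input.rank);
  std::vector<int64_t> out(input.rank);
  for (int64_t i = 0; i < input.rank; ++i) {
    // Extent covered by the dilated kernel along dimension i.
    int64_t window = dilations.values[i] * (kernel.sizes[i] - 1) + 1;
    assert(window <= input.sizes[i]);
    out[i] = (input.sizes[i] - window) / strides.values[i] + 1;
  }
  return out;
}
```

If either the buffers or the attributes were ranked, this function would have to become a template or a family of overloads, which is the "more shims" cost mentioned above.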

I think it would be better to go the unranked route, but it will be more effort to get something working, so feel free to experiment with different designs. Conv is probably the best op to start from since it will exhibit most of the tradeoffs.

Now the key point: whatever happens, this is not allowed to leak into the “named” Linalg op definition, and this is a hard constraint. If you find yourself thinking that you want to add just one extra attribute to encode an integer that will make all this easier to implement, please don’t, and let’s discuss :)

Flattened Attribute Descriptor or Pointer to Attribute Descriptor

At the moment the buffers are passed by pointer to a data structure allocated with MLIR. This is because of deep issues related to alignment, packing, C layout and interop that MLIR has absolutely no support for and will not have for quite some time. The TL;DR is that there is a very significant effort at the clang + MLIR level to get these right.

As a consequence, the only 2 options today are to:

  1. Flatten all attributes and turn them into arguments to the C function. This must be recursive: all structs must be flattened to avoid layout issues (this is not 100% true but a good rule of thumb still…).
  2. Allocate a struct in the MLIR LLVM Dialect and pass a pointer to it to the C function. This bypasses the layout issues.

For your particular use case, 1. may work well enough. For buffers in general we use 2. because flattening would make the general case too impractical on the C++ side.

Note that if you go the 1. route, there may be issues if you want an unranked representation. We probably need to try, see what breaks, and iterate.
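
The two options can be sketched on the C++ side as follows. The function names, the `Conv2dParams` struct, and the choice of a 2-D conv without padding are all assumptions made for this example. In a real setup the caller would be compiler-generated code, and for option 2 the struct would be allocated by it rather than by C++.

```cpp
#include <cstdint>

// Output extent along one dimension for an unpadded, dilated, strided conv.
static int64_t outputExtent(int64_t in, int64_t k, int64_t stride,
                            int64_t dilation) {
  return (in - (dilation * (k - 1) + 1)) / stride + 1;
}

// Option 1: everything is flattened into scalar arguments, so the callee
// never depends on how a struct is laid out in memory.
extern "C" int64_t conv2d_num_outputs_flat(int64_t in_h, int64_t in_w,
                                           int64_t k_h, int64_t k_w,
                                           int64_t stride_h, int64_t stride_w,
                                           int64_t dil_h, int64_t dil_w) {
  return outputExtent(in_h, k_h, stride_h, dil_h) *
         outputExtent(in_w, k_w, stride_w, dil_w);
}

// Option 2: the caller fills a struct and passes a pointer to it. Both sides
// must agree on this layout; here it is a plain run of int64_t fields.
struct Conv2dParams {
  int64_t in[2];
  int64_t kernel[2];
  int64_t strides[2];
  int64_t dilations[2];
};

extern "C" int64_t conv2d_num_outputs_ptr(const Conv2dParams *p) {
  return conv2d_num_outputs_flat(p->in[0], p->in[1], p->kernel[0],
                                 p->kernel[1], p->strides[0], p->strides[1],
                                 p->dilations[0], p->dilations[1]);
}
```

The flattened signature already needs eight arguments for a 2-D case, which hints at why flattening becomes impractical for general buffers and why it interacts awkwardly with an unranked representation.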

Ongoing Linalg Evolution

Orthogonally to all this, I am in the process of automating the definition of Linalg ops in TableGen. The gist is that many ops are just a particular configuration of the linalg.generic attributes + region, with some extra attributes. This property is used everywhere in transformations.

This will evolve to work with the “Attribute Selection” above but will involve multiple iterations and we will likely need to synchronize a bit more around the boundaries.

That’s about what I can think of right now. If you need specific pointers to get started, you could look at how the LLVM lowering of MemRef occurs, especially at function boundaries. For now I have assumed that these are somewhat understood, but I am happy to dig deeper.

In any case I am thrilled to see this moving, thank you for pushing this forward!

Also, @stellaraccident, @benvanik, and @antiagainst may have additional valuable insights to share on the topic.