A question about linalg.generic's input/output buffers' aliasing

Hello all,

I have a question about linalg.generic.

Is it allowed to write a linalg.generic whose ins and outs operands are the same memref value?

#map0 = affine_map<(d0, d1) -> (d0, d1)>
linalg.generic {
    indexing_maps = [#map0, #map0],
    iterator_types = ["parallel", "parallel"]}
  // in and out memrefs are equivalent
  ins(%mr: memref<30x131072xi64>) outs(%mr : memref<30x131072xi64>) {
^bb0(%inelem: i64, %outelem: i64):
  %v = arith.muli %inelem, %inelem : i64
  linalg.yield %v : i64
}     

I guess the in and out buffers must not alias in general, because aliasing would invalidate lowering linalg.generic to affine loops, but the pattern above is helpful for describing loops that perform in-place updates.

If you want to do in-place updates, you can just leave the ins empty:

#map0 = affine_map<(d0, d1) -> (d0, d1)>
linalg.generic {
    indexing_maps = [#map0],
    iterator_types = ["parallel", "parallel"]}
    outs(%mr : memref<30x131072xi64>) {
^bb0(%outelem: i64):
  %v = arith.muli %outelem, %outelem : i64
  linalg.yield %v : i64
}     

On memref types, the outs operand is always updated in place.

Oh, that’s great. Thank you!

We also use this in a dynamic fashion, where we keep both ins and outs but those memrefs might alias at runtime. As long as there are no interior conflicts in the generic, that use case is covered by linalg semantics.
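
To make "interior conflicts" concrete, here is a rough Python emulation of the loop nest (not MLIR, and not how any lowering is actually implemented). The helper `run_generic` and its `read_index` parameter are hypothetical names standing in for the op and the input's indexing map. The loops run in one fixed sequential order, which is only one of the orders a "parallel" iteration space permits.

```python
def run_generic(inp, out, read_index):
    """Emulate a 2-D generic op that writes the square of an input element.

    `read_index` stands in for the input's indexing map; the output always
    uses the identity map (i, j) -> (i, j). Hypothetical helper, for
    illustration only.
    """
    rows, cols = len(out), len(out[0])
    for i in range(rows):
        for j in range(cols):
            r, c = read_index(i, j)
            out[i][j] = inp[r][c] * inp[r][c]
    return out


def identity(i, j):
    return (i, j)


def transpose(i, j):
    return (j, i)


def fresh():
    return [[1, 2], [3, 4]]


def zeros():
    return [[0, 0], [0, 0]]


# Identity maps: each iteration reads and writes the same element,
# so passing the same buffer as input and output changes nothing.
a = fresh()
aliased_identity = run_generic(a, a, identity)
separate_identity = run_generic(fresh(), zeros(), identity)

# Transposed read: iteration (1, 0) reads element (0, 1), which iteration
# (0, 1) has already overwritten when the buffers alias.
b = fresh()
aliased_transpose = run_generic(b, b, transpose)
separate_transpose = run_generic(fresh(), zeros(), transpose)
```

With identical maps the aliased and non-aliased runs agree. With the transposed read they differ, which is the kind of conflict that makes aliasing unsafe.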

Hi @herhut,
Could you provide an example or details about the case, please?

The corresponding code is in https://cs.opensource.google/tensorflow/tensorflow/+/master:tensorflow/compiler/mlir/tools/kernel_gen/transforms/buffer_reuse_pass.cc.

In essence, we try to detect cases where a linalg.generic could be performed in place, and then replace the allocation of the output buffer with a dynamic reuse of the input. Dynamic here means that at runtime we check whether we hold the last reference to the input (its reference count is 1); if so, we reuse it, and otherwise we allocate a fresh buffer.
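
The runtime decision can be sketched roughly as follows. This is a simplified Python model, not the code of the pass linked above; `RefCountedBuffer`, `output_buffer_for`, and `square` are hypothetical names, and the reference count is an explicit field rather than anything a real runtime exposes this way.

```python
class RefCountedBuffer:
    """Hypothetical stand-in for a reference-counted runtime buffer."""

    def __init__(self, data):
        self.data = data
        self.refcount = 1  # the creator holds the only reference


def output_buffer_for(inp):
    """Choose the output buffer for an element-wise op on `inp`.

    Reuse `inp` when we hold the last reference to it; otherwise
    allocate a fresh buffer of the same size.
    """
    if inp.refcount == 1:
        return inp
    return RefCountedBuffer([0] * len(inp.data))


def square(inp):
    """Element-wise square; safe in place because each element is read
    and written at the same index."""
    out = output_buffer_for(inp)
    for i in range(len(inp.data)):
        out.data[i] = inp.data[i] * inp.data[i]
    return out


# Sole owner: the input buffer is reused as the output.
unique = RefCountedBuffer([1, 2, 3])
reused = square(unique)

# A second owner exists: a fresh buffer is allocated and the
# input is left untouched.
shared = RefCountedBuffer([1, 2, 3])
shared.refcount = 2
allocated = square(shared)
```

The reuse is only valid because the op was already determined to be safe in place, so input and output aliasing at runtime cannot change the result.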
