Bufferization error related to `memref.clone`

dpotop · November 3, 2021, 4:00pm

Hello everybody,

We ran into a problem with some of our older code that no longer compiles, and it is not clear to us whether this is an MLIR compiler bug or whether we are simply not invoking the right pass at the right place in the pipeline.

The MLIR code fragment (simplified from a larger example) is the following:

func private @process(%i:tensor<512xf32>,%j:tensor<512xf32>)->(tensor<512xf32>,tensor<512xf32>)

func @myfun()->(tensor<512xf32>,tensor<512xf32>) {
  %1   = constant 1 : index
  %0   = constant 0 : index
  %512 = constant 512 : index
  %zero = constant dense <0.0> : tensor<512xf32>
  
  %o1,%o2 =
    scf.for %idx = %0 to %512 step %1
      iter_args(%acc1=%zero,%acc2=%zero)->(tensor<512xf32>,tensor<512xf32>) {

      %acc1_out,%acc2_out = call @process(%acc1,%acc2)
        :(tensor<512xf32>,tensor<512xf32>)->(tensor<512xf32>,tensor<512xf32>)

      scf.yield %acc1_out,%acc2_out:tensor<512xf32>,tensor<512xf32>
    }
  
  return %o1,%o2: tensor<512xf32>,tensor<512xf32>
}

On this code we apply the following compilation command:

mlir-opt debug.mlir --tensor-bufferize --tensor-constant-bufferize  --scf-bufferize --func-bufferize --buffer-results-to-out-params --finalizing-bufferize --buffer-deallocation | 
mlir-opt --convert-linalg-to-affine-loops --lower-affine --convert-scf-to-std | 
mlir-opt --canonicalize --convert-memref-to-llvm --canonicalize --convert-std-to-llvm --canonicalize --reconcile-unrealized-casts

The whole pipeline fails at --reconcile-unrealized-casts because builtin.unrealized_conversion_cast operations remain in the code. These casts most likely remain because some memref.clone operations are still present, which neither --convert-memref-to-llvm nor --canonicalize has converted.

Funnily enough, if we use separate tensor constants to initialize the two iteration arguments of the scf.for loop, the need for a memref.clone disappears, and so does the problem. But if the input is not a constant but another tensor variable, this workaround does not apply, so our problem remains (not to mention that we should be able to compile correct code in the first place).
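
For illustration, the workaround with separate constants might look like the sketch below. It is the same function as above, with only the initialization changed; the names %zero1 and %zero2 are chosen here for the example, and the function is renamed to @myfun_workaround to keep it distinct.

```mlir
func private @process(%i:tensor<512xf32>,%j:tensor<512xf32>)->(tensor<512xf32>,tensor<512xf32>)

func @myfun_workaround()->(tensor<512xf32>,tensor<512xf32>) {
  %1   = constant 1 : index
  %0   = constant 0 : index
  %512 = constant 512 : index
  // One constant per iteration argument (names are illustrative),
  // instead of sharing a single %zero between %acc1 and %acc2.
  %zero1 = constant dense<0.0> : tensor<512xf32>
  %zero2 = constant dense<0.0> : tensor<512xf32>

  %o1,%o2 =
    scf.for %idx = %0 to %512 step %1
      iter_args(%acc1=%zero1,%acc2=%zero2)->(tensor<512xf32>,tensor<512xf32>) {
      %acc1_out,%acc2_out = call @process(%acc1,%acc2)
        :(tensor<512xf32>,tensor<512xf32>)->(tensor<512xf32>,tensor<512xf32>)
      scf.yield %acc1_out,%acc2_out:tensor<512xf32>,tensor<512xf32>
    }

  return %o1,%o2: tensor<512xf32>,tensor<512xf32>
}
```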

Our question is the following: is it possible to compile our fragment by using other compilation options (and if so, which ones?), or have we just stumbled on a bug or limitation of mlir-opt?

Our MLIR commit version is d9e46beace3120fbc4810dda5c3ed88f93e862a4. We would like, if possible, to remain on it, in order not to break other things…

Best regards,
Dumitru

dfki-jugr · November 10, 2021, 2:31pm

I had a closer look at the bufferization pipeline, and I think the clones emitted by the buffer-deallocation pass are correct here. We introduce them because the buffers are used as iteration arguments in a loop, and we need to ensure that the program remains correct across all loop iterations. In the current state, clones are usually collapsed by a canonicalization pass. This works fine for most of the emitted clones, but not in this special case, where we still need an additional buffer.
We can introduce an additional pass that converts all remaining clone operations into alloc + copy operations, which can then be handled by later passes (e.g. memref to LLVM).
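
As a rough sketch of the idea (hand-written for illustration, not the output of any existing pass; the function names @before and @after are made up), a remaining clone such as the one in @before would be rewritten into the alloc + copy pair shown in @after:

```mlir
// Before: a clone that canonicalization could not fold away.
func @before(%src: memref<512xf32>) -> memref<512xf32> {
  %buf = memref.clone %src : memref<512xf32> to memref<512xf32>
  return %buf : memref<512xf32>
}

// After: the clone is expressed as an explicit allocation plus a copy,
// both of which have lowerings in later passes.
func @after(%src: memref<512xf32>) -> memref<512xf32> {
  %buf = memref.alloc() : memref<512xf32>
  memref.copy %src, %buf : memref<512xf32> to memref<512xf32>
  return %buf : memref<512xf32>
}
```

The new buffer still has to be deallocated by whoever owns it, exactly as the result of the clone would have been.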

What do you think?

dpotop · November 19, 2021, 5:32pm

Yes, converting all remaining memref.clone operations into memref.alloc and memref.copy seems like the right solution. I noticed that this is the approach proposed in a new patch. Thanks a lot!
