Rank-reducing memref.subview OffsetSizeAndStrideOpInterface interface issues

There is no such thing as input and output: the offset/size and stride properties belong to the op, not to the input or output memref.

It is a mistake to see these properties as being attached to the memref.

If you look more closely at the patch you’ll see that getting the n-th dynamic index is a property of the memref. We could indeed put that behind a helper function in ShapedType, but since this need hasn’t materialized before, I do not think it has more general usage.

The only information accessed through OffsetSizeAndStrideOpInterface is the sizes() (dynamic) operands.

Also, I am unclear what you mean by:

The following is valid IR; you can choose to collapse or not collapse a particular unit dimension (you still have to provide the proper type or the verifier will complain):

#map = affine_map<(d0, d1, d2)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3)>
module  {
  func @rank_reducing_subview_dim(%arg0: memref<?x?x?xf32>, %arg1: index, %arg2: index) -> index {
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %c4 = constant 4 : index
    %c2 = constant 2 : index
    %0 = memref.subview %arg0[%c0, %arg1, %c1] [%c4, 1, %arg2] [%c1, %c1, %c1] : memref<?x?x?xf32> to memref<?x1x?xf32, #map>
    %1 = memref.dim %0, %c2 : memref<?x1x?xf32, #map>
    return %1 : index
  }
}
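For contrast, here is a sketch of the rank-reduced variant of the same subview, with the unit dimension dropped from the result type (the layout map `#map2` is illustrative; the exact strided form depends on the source strides):

```mlir
#map2 = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
// Same subview as above, but the static-1 middle dimension is collapsed
// away in the result type; the verifier accepts either form.
%0 = memref.subview %arg0[%c0, %arg1, %c1] [%c4, 1, %arg2] [%c1, %c1, %c1]
    : memref<?x?x?xf32> to memref<?x?xf32, #map2>
```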

You can also use linalg.collapse_shape / linalg.expand_shape to manipulate the type and drop / insert 1s. @pifon2a is currently working on moving those to the memref dialect as per [RFC] Reshape Ops Restructuring.
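As a sketch of what that looks like (types and reassociation indices are illustrative), dropping a unit dimension with linalg.collapse_shape:

```mlir
// The reassociation [[0, 1], [2]] says: source dims 0 and 1 together
// form result dim 0, and source dim 2 forms result dim 1, so the
// static unit dim is merged away.
%1 = linalg.collapse_shape %0 [[0, 1], [2]]
    : memref<?x1x?xf32> into memref<?x?xf32>
```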

Those index manipulations in the patch look arcane to someone who is unaware of the context, and they are needed in at least two places now (and possibly more in the future).

Regarding rank reduction, it wasn’t clear from the docs that subview can do partial rank reduction (and I somehow completely missed linalg.collapse_shape).

Thanks

Fair enough, added a helper function to refactor that logic and give it a name.
PTAL.

@nicolasvasilache And what exactly happened in the review?

Also, I checked linalg.collapse_shape and it seems it has an LLVM lowering only for the fully static shape case; it won’t lower conversions like memref<1x1x?xi32> to memref<?xi32>.
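The unsupported case is, roughly, a collapse that merges a dynamic dimension with static unit dimensions, e.g. (sketch):

```mlir
// Merge both static unit dims into the dynamic dim; reportedly this
// dynamic case has no LLVM lowering at the moment.
%1 = linalg.collapse_shape %0 [[0, 1, 2]]
    : memref<1x1x?xi32> into memref<?xi32>
```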

Unfortunately this is proving more complex than the simple fix I sent in D105558 ([mlir][MemRef] Fix DimOp folding of OffsetSizeAndStrideInterface).

The problem with that fix is that it is not extensible:

  • OffsetSizeAndStrideOpInterface does not have enough information to resolve a result dimension:
    • for an “extract”-type op, it is simply the corresponding value among the sizes operands (what D105558 implements)
    • for an “insert”-type op, it simply forwards to the “dest” operand (also what D105558 implements, but by special-casing)
  • special casing is not extensible and the above patch breaks this IREE Flow Op (i.e. core cannot special case and should use an interface)
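Concretely, the two cases look like this (a sketch; op names, layout map, and operands are illustrative only):

```mlir
#map = affine_map<(d0)[s0, s1] -> (d0 * s1 + s0)>
%c0 = constant 0 : index
%c1 = constant 1 : index
// "extract"-like: dim 0 of the result is simply the size operand %sz.
%sv = memref.subview %src[%off] [%sz] [%c1]
    : memref<?xf32> to memref<?xf32, #map>
%d0 = memref.dim %sv, %c0 : memref<?xf32, #map>   // resolves to %sz
// "insert"-like: the result has the shape of the destination operand,
// so the dim just forwards to the "dest" operand.
%ins = tensor.insert_slice %t into %dest[%off] [%sz] [%c1]
    : tensor<?xf32> into tensor<?xf32>
```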

The proper core interface to use for this seems to be InferTypeOpInterface.
However, folks have recently converged on moving the “canonicalization” of dim(InferTypeOpInterface) into a separate pass (D104321: [mlir] Move `memref.dim` canonicalization using `InferShapedTypeOpInterface` to a separate pass).

It seems adopting this more generally would mean moving all DimOp canonicalization patterns to this pass (i.e., why should some producer ops canonicalize automatically while others require a special pass?).

This brings me to trying to make ops that conform to OffsetSizeAndStrideOpInterface also conform to InferTypeOpInterface for the purpose of enabling the folding of DimOp.

This is however not straightforward because the result may be any type where 1’s are folded (i.e. if tensor<1x?x1x2xf32> is a valid return type then so is tensor<1x?x2xf32> and so is tensor<?x1x2xf32> or tensor<?x2xf32>).

Dropping the rank-reducing semantics does not appear to be a good option, because a combination of ops would be needed to achieve the same effect, which would in turn require special-casing in transformations; that is a red flag.

Another simple way would be to extend OffsetSizeAndStrideOpInterface with extra information describing whether the op is “insert”-like or “extract”-like, but this seems to go against the spirit of InferTypeOpInterface and D104321 ([mlir] Move `memref.dim` canonicalization using `InferShapedTypeOpInterface` to a separate pass).

In any case we should be consistent across our decisions, so my first question comes from looking at Remove canonicalizer for memref.dim via ShapedTypeOpInterface.

How do we decide, for a given op, whether “folding” dim(op.result()):

  • should be a canonicalization (which implies that op must not be InferTypeOpInterface)
  • should not be a canonicalization ?

It seems there is an undesired coupling between “op semantics” (i.e. is it an InferTypeOpInterface) and “canonicalization vs pass”.

For example, it is unclear to me why AllocOp should not be InferTypeOpInterface.

Lastly, is there a particular recommendation how to address the original problem in this thread, in light of the discussion above?

@_sean_silva @MaheshRavishankar @herhut @stellaraccident

See my post above: the problem is more complex in general because we need extensibility and there are other moving parts in the system. Core can only special-case the ops it knows about.

Also @tpopp, since the generalization aspect is related to D103076 ([mlir] Fold memref.dim of OffsetSizeAndStrideOpInterface outputs).

Thanks, as I mentioned before, I worked around the original issue locally by using a custom op to do rank reduction after the subview. I have tried to replace the custom op with linalg.collapse_shape, but it doesn’t have an LLVM lowering for my specific case (collapsing a dynamic dim with a number of static unit dims). Is there any active work on supporting this? Supporting this specific lowering case seems straightforward at first glance, and I can try to implement it myself.

IREE uses expand/collapse with dynamic shapes as well, but I think they lower out through a different path (one that also works with Vulkan/SPIR-V, which have other constraints).

I don’t know of anyone actively looking at extending lowering of these ops to LLVM in core.
Adding support for more use cases would be a very welcome contribution, thanks much!

Could this become a memref.reshape? That already has a lowering.

So my take is that if it does not produce any new IR but just forwards operands, it is a canonicalization. The dim(alloc(...)) falls into that category.
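For example (a sketch):

```mlir
%c0 = constant 0 : index
%a = memref.alloc(%n) : memref<?xf32>
// Folds to %n directly: no new IR is produced, an existing
// operand is simply forwarded.
%d = memref.dim %a, %c0 : memref<?xf32>
```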

Why can we not canonicalize for ops that implement InferTypeOpInterface?

This would even be a folding in my book.

Isn’t that what everyone agreed to in this post? Remove canonicalizer for memref.dim via ShapedTypeOpInterface

memref.reshape requires an identity layout map according to the docs (which is not true for a subview result), but I can’t find whether this is enforced anywhere in the code.

Actually, I am not sure about linalg.collapse_shape either, as it can act as a view OR copy the data depending on its parameters, but I always want a view in this case (and an op that has ViewLikeOpInterface).

I think the issue is that that interface does too much: a subset of it could be considered a canonicalization, whereas most of it should not. The specific thing I think you (and I) are reaching for would be something like ResultDimMapsToOperandDimOpInterface (spelled out for emphasis). There is, in fact, one method on that interface that does this specifically, but IIRC it is defaulted in a way that makes it fall back to the general type inference methods, which can do anything (and in some dialects do), including creating arbitrary IR in other dialects, etc.

If we had that separation, I think there is a good argument that it is in the scope of canonicalization to replace a result dim with a (possibly scoped/trivial expression of a) corresponding operand dim.


Answering a few general questions here (I looked through the post and the original patch, so I have an idea of what’s happening, but don’t want to delve into those details right now). A few related points on this, AFAICS:

  1. My current view is that dim canonicalizations should really be a separate pass, and the whole list in DimOp::fold that does “if (isa<op>()) {...}” should be deprecated. In general this is not a fold unless the dimension is static. For the dynamic case (which really should be the case used when reasoning about folds/canonicalizations/passes), resolving the dim of an operation in terms of its operands creates new dims. If it doesn’t in some cases, then special-casing that in fold is really confusing and does not have good separation of concerns.

  2. Resolving dims using the InferShapedTypeInterface is really a fixed-point iteration. Piggybacking this on canonicalization is probably a mistake; it’s doing too much at once. Canonicalization is meant to get an operation into its “canonical” form. Doing a fixed-point iteration to resolve memref.dim operations along with canonicalizations is overkill. As Stella found out, it is probably better to be more purposeful about when you want to resolve shapes. I would expect a client compilation pipeline to run canonicalization more frequently and shape resolution less frequently.

  3. These are somewhat unrelated to the original problem mentioned above. I don’t think there should be a dim canonicalization based on OffsetSizeAndStrideOpInterface if it does not have enough information to compute the shape of the result. We should definitely drop that from the dim fold/canonicalization.

Yeah, I think a new interface (or new method on InferShapedTypeOpInterface) is needed here for the “sufficiently simple” cases that we want to be canonicalizations (probably the “resolves to existing operand” and “resolves to dim of existing operand”). OffsetSizeAndStrideOpInterface could provide helper functions, which assist in implementing this interface for “insert-like” and “extract-like” ops.

Although that will be interesting, because, for example, LinalgOps could implement SufficientlySimpleResultShapeInterface by always moving dim ops from results to “outs”, but I think we deliberately avoid that for other reasons.

Thanks, as I mentioned before, I worked around the original issue locally by using a custom op to do rank reduction after the subview. I have tried to replace the custom op with linalg.collapse_shape, but it doesn’t have an LLVM lowering for my specific case (collapsing a dynamic dim with a number of static unit dims). Is there any active work on supporting this? Supporting this specific lowering case seems straightforward at first glance, and I can try to implement it myself.

FYI, I will start working on linalg.collapse_shape with dynamic shape because of a requirement from npcomp soon (next week).


I would really avoid this. It’s not a canonicalization: it is not getting an op into its “canonical” form, but rather doing something else. Can you provide a more concrete use case for this?

This is great! This is really needed for some downstream uses in IREE as well.