On querying an Operation's intrinsic (core) vs external/user-defined attributes

This issue came up in the context of this revision: ⚙ D111837 Retain attributes of original scf::ForOp when folding, and it’d be good to get thoughts on this. It’s on the propagation of user-defined attributes when ops are replaced during rewrites and transforms.

It looks we need methods on derived op types: getIntrinsicAttributes() and getExternalAttributes(); getIntrinsicAttributes() would return the attribute names (or attributes) that are intrinsic/core to the op (part of the op’s definition), and getExternalAttributes() returns attributes that are user-defined/not part of the op’s definition. E.g., in the context of the revision above where the external attributes were being dropped during replacement on a rewrite pattern: since the op’s iter args, results, and iter operands are being trimmed and the op doesn’t have any intrinsic attributes that correspond to the cross-iter value flow/init and yield, the compiler developer of this pattern knows that it’s correct to propagate all external attributes as is. An inherent assumption here is that the transforms and rewrites on the ops have the guarantee that any user-defined and external attributes by design don’t have an impact on the core semantics of the op, and it should be correct for them to propagate them as is. Had those attributes been core attributes, they’d have had to be appropriately handled.

A corollary of this is that those attaching user-defined attributes shouldn’t expect any special handling and treatment for the propagation of those. For the above revision, had there been a getExternalAttributes(), those attributes could have been obtained and just propagated. Since scf.for has zero intrinsic attributes, just using getAttrs over there would be fine for now.

Unless I missed it, I don’t see any methods/support for this in the codebase. There is no OpInterface that could enable such support, nor are there any virtual methods for it in the Op or Operation class.

Other notes:

  1. If we don’t want to require every op to have such accessors, this could be enabled via an OpInterface.

  2. If this is enforced for every op (derived op type), every non-tablegen’ed op would have to define such methods. For tablegen’ed ops, these could be auto-generated. There may be cases where this information may be useful even for certain OpInterfaces, but it’s not needed or perhaps even practical to support this on an Operation *. Also, this information isn’t obviously available for unregistered ops.

  3. If all of the above makes sense, it’d be good also to clarify the behavior users could expect from external attributes vis-a-vis transforms/rewrites.

Hi Uday,

I’m not sure what you’re suggesting. It isn’t safe and I don’t support having “unknown attributes” be propagated through folds and canonicalizations automatically. Are you saying that you just want a method on Operation to make it easier to implement manual propagation in known ok cases?

To just pick a specific example, in the scf.for patch, why is it ok to forward unknown attributes? Those attributes could depend on invariants that the newly transformed op doesn’t provide.

-Chris

1 Like

I’m on the same line as Chris here.

Previous related discussion also:

And LangRef: MLIR Language Reference - MLIR ; which I believe names “discardable attributes” what is referred to here via getExternalAttributes()

A fold doesn’t and can’t create new IR (only in-place updates or folding away) - the question of propagation is moot there. There’s nothing specific w.r.t folds and canonicalization in what I was describing. I was suggesting that it should be considered safe to propagate all external attributes “as is” through any replacements (canonicalizations, transform utilities, pass op rewrite patterns) – if a user wanted special treatment for those, they shouldn’t have been using external attributes, right? I was wondering why this isn’t a reasonable contract to have? (typically for 1:1 op replacements where the source and target op types are the same)

I didn’t understand what you meant by “manual propagation” here. Propagation of intrinsic attributes would have to be anyway done taking care of their semantics. The method/hook I’m suggesting would allow you to get hold of all external attributes.

This whole topic is quite related to the design of LLVM metadata which hasn’t aged well. You have three basic policies that you can put into canonicalizations and folds when you encounter “extra info” on nodes that you’re working with:

  1. block the transformation
  2. implicitly drop the metadata
  3. implicitly propagate

The right answer depends on the metadata in question. If you have attrs that changes the semantics of the nodes in question or which is required to be propagated for correctness, then #1 is generally the right thing to do.

If you don’t know anything about the metadata, #2 is the conservatively correct thing (this is what MLIR does, and what LLVM does for its metadata). This is important because there are many different kinds of canonicalizations (for example) including ones that walk up the SSA graph and change “used” operations: e.g. strength reducing an operand because you know the only user doesn’t need all the bits being produced. These sorts of transformations scramble “arbitrary unknown” metadata because the properties they reflect may not hold anymore.

The “implicitly propagate” policy can make sense for certain kinds of things - for example, if you’re trying to propagate hints about loop performance (rolls a lot, rolls only a few times, etc) then it can be useful to carry them forward as loops are transformed. The problem with this though is that you can’t differentiate between friendly and unfriendly to propagate metadata.

As a consequence of this, we implicitly drop out of conservatism (we aren’t perfect about this though - eg when nodes are mutated in place). We’ve discussed multiple times that it would be possible to make a scheme for attributes to help differentiate between the cases, but no one has had time or interest in baking out that model. It is also very difficult and nuanced to do.

-Chris

Hi Chris,

My original post was focused purely on replacements where you replace an op of OpTy with another op of the same OpTy - sorry, if it didn’t make that clear. This is the scenario encountered in the PR ⚙ D111837 Retain attributes of original scf::ForOp when folding On this PR, @mehdi_amini states:
“the safe things generic transformations should do is discard unknown attributes.”

And I would argue the other way that in the above scenario, the safe or safer thing is to propagate attributes as is.

  1. Note that when you have a folding hook or a rewrite/transformation that is updating an op in-place, you never check (and wouldn’t want to) as to what external attributes the op has and drop them. All in-place fold hooks and perhaps all rewrites in MLIR where this happens just update the op in-place when they have to. So, we are already implicitly propagating any external/user-defined attributes on in-place rewriting. Stating that you’d want to do something different when it’s a replacement in a canonicalization pattern or another utility is inconsistent with the former pervasive behavior. How do you reconcile the two?

  2. In the above scenario of replacement, if one argues that propagating external attributes is unsafe, one could argue that dropping them is more unsafe. In order to propagate them during replacement, we need a hook to get hold of the external attributes. This is also useful in other cases where it’s not a 1:1 replacement but the author of the rewrite knows it’s safe to propagate. (Perhaps this is what you referred to earlier.)

  3. If all else is the same on safety, propagating prevents loss of information and provides the desired functionality to pass pipeline developers who are using these external attributes to propagate short-lived information. The risks of using external attributes unknown to utilities are already clear to developers: there shouldn’t be any surprises if they are propagated like in the scf.for rewrite above. This is also consistent with everything else already happening during in-place update of ops.

Let’s try to resolve ⚙ D111837 Retain attributes of original scf::ForOp when folding before a more general discussion on attribute propagation.

Why wouldn’t we want to? Seems like necessary to be safe in general (in the absence of a taxonomy of attributes and some way to differentiate).

Can you expand on this? This goes against my understanding of attributes, and as I mentioned before even the LangRef refers to these as “discardable” instead of “external”.

Any discardable annotation can be invalidated by the change in number of arguments here.

This is back to the same question that I was answering above in detail.

The LangRef description isn’t guarded by the context of this discussion. Obviously, “discardable” doesn’t mean “must discard” nor safer to discard than to propagate in all scenarios.

Yes, but the point I’ve been making here is that it’s not the rewriting code’s problem if it that information became invalid because it’s an external attribute – it shouldn’t have to care about it. You already don’t care about it throughout the codebase when you are doing in-place rewrites. Trying to update all code updating ops in-place to look and check for external attributes and discard them is both impractical and more importantly meaningless.

It means at least that they are “safe to discard” I believe, which contradicts your claim as I read it: “one could argue that dropping them is more unsafe”.

You’re arguing for 3 in Chris’ list, but Chris explained what is the problem with this.

I consider this a problem / bug though.

Right, I can see what you’re saying. The problem is that we “don’t know” what situation we’re in, so the infra has to make a decision about what to do in the absence of that knowledge.

Just to briefly explain what “safe” means when I use it: “extra” attributes aren’t really allowed to add new semantics to an op they are attached to. Transformations won’t know about them, they won’t be reliably propagated in all cases, and macro changes to the IR can break assumptions the attribute is trying to convey. In the case of LLVM, these problems didn’t usually manifest in first order issues, but in second order issues that only popped up with weird combinations of transformations. A consequence of this idea (that extra attributes are hints and can’t carry semantics) is that the “safe by default” thing to do is to drop them.

That said, I understand that you have a specific goal in mind and propagation is probably the right thing to do for these use cases. I can think of two specific paths forward:

  1. We can define a schema for attributes, so the attributes themselves carry information about “I am safe to implicitly propagate”. We would nail down what that exactly means, but if we had that, you could put this bit on the attributes in question and we could make the infra look for it and propagate “just these” attributes.

  2. You could change scf.for to have an explicit array attribute which would usually be empty, but could have other “propagate-able hints”. The rewrites for scf.for would always manually move these things over, and we’d use the array attribute as the thing that distinguishes between “propagate-able” and not. This is a lower impact change on the system, that could build experience towards building the larger system.

-Chris

2 Likes

Not really. I don’t see Chris’ earlier explanation earlier any of that. See Chris’ comment further downthread, but even here, please see counter-arguments below.

But I don’t see any valid argument that you present above as to why it’s a problem/bug. It’s not just an in-place rewrite but every transformation or utility that is using clone on ops/blocks, takeBody, or for that matter moving ops between blocks is currently preserving external attributes. Saying that you’d want to discard all external attributes each time you want to mutate IR and in all cases such mutation is happening really has no rationale AFAICS and is worse by all means than propagating them in the set of cases I’m identifying. (For eg., you won’t even be able to debug IR inside of a pass calling a sequence of utilities because such debug attributes would get dropped within it but this is just a side note.)

@clattner This is where I’m getting stuck. On the one hand, you state in the first line, “extra attributes aren’t really allowed to add new semantics …”, and then to the contrary that “and macro changes to the IR can break assumptions the attribute is trying to convey” – how could an external attribute convey anything given that by definition it can’t add semantics and is considered “discardable”?! It can’t and shouldn’t expect anything to be conveyed. Are you instead referring to intrinsic attributes that a user has added on the op’s definition without worrying about its impact on transformations and rest of the infra? Those are definitely a problem, but propagation of external attributes in 1:1 replacements, in-place rewrites, and in utilities that are moving and cloning IR (which is all pervasively and correctly happening) isn’t a problem at all as far as I can see.

I guess it does not come as obvious to you as it is to me but here is the logical sequence:

  1. LangRef: discardable attributes have semantics defined externally to the operation itself, but must be compatible with the operations’s semantics.
    That does not mean that these attributes don’t have any semantics information, but that they have to be compatible with the operation semantics itself: they can “augment it” as long as they don’t contradict any conservative handling of the operation that wouldn’t know about these “discardable attributes”.
    Also by definition we should be able to safely discard them at any time.

  2. Since these attributes can be used to augment the semantics of an operation, I would think it’d be legal to add a #mydialect.parallel attribute to scf.for when lowering a higher-level operation to scf.for loop, assuming the lowering knows the loop can be parallel.
    This does not contradict the semantics of scf.for, as long as a sequential execution of the loop is semantically equivalent.

  3. An “in-place” update or canonicalization could introduce new dependencies across iterations: since it does not have to know about or honor the #mydialect.parallel attribute, it seems perfectly fine to do so. It is valid to just consider scf.for sequential and perform a transformation that would lose parallelism as a side-effect.
    At this point your in-place update would leave this scf.for with the #mydialect.parallel attribute, that I may use in my subsequent pass to lower such loop to a parallel form (maybe with my own runtime/dispatch), incorrectly so.

Locations aren’t attributes, are you thinking of something else?

The attributes we’re working with that led to ⚙ D111837 Retain attributes of original scf::ForOp when folding fall into the category of “safe to discard” as far as preserving semantics, but they are very important hints that we rely on for later passes. I also don’t think they fall into the category of “may have an invariant broken by one of the canonicalization transformations” since they don’t say anything about the semantics of the loop itself.

I’d be content with either of these approaches, since this is currently blocking some of our work. Since (2.) is less heavy-handed, perhaps that can be implemented in the meantime while a longer-term plan gets settled? Also looking for other suggestions as to how this can get us unblocked.

I’d be fine with an explicit array for annotation on scf.for for now, if others are OK with it!

1 Like

Could you elaborate a bit on what these hints convey concretely ?

Mostly they are talking about loop unrolling - how much to unroll, whether we should enable/forbid unrolling, etc.

You are making a big leap from (1) to (2) and AFAICS this isn’t really valid reasoning. If a user is trying to add extra attributes to the op, then a pass registered on the op’s dialect, a rewrite pattern (a fold, canonicalizer) registered on the op, etc. can’t be expected to know about it. The conclusion you are making in (2) from
“Since these attributes can be used to augment the semantics of an operation”
is problematic and contradictory: those semantics are what the user perhaps wishes and not what the infrastructure (like the one linked in the PR above) needs to care about. This also directly contradicts what @clattner says above: “Just to briefly explain what “safe” means when I use it: “extra” attributes aren’t really allowed to add new semantics to an op they are attached to.”

Could we converge on this first before proposing or supporting a solution? I think it’s pretty clear the external attributes don’t add any semantics to the op.

I wasn’t referring to locations - it could be any transient information attached by a pass/utility author for debugging: to see what’s going where.

There are pretty direct questions above that I don’t see an answer to – I’d prefer not heading in this direction. The patch submitted (⚙ D111837 Retain attributes of original scf::ForOp when folding) is as is fine.

Was there a typo here? For me it isn’t clear how they don’t in general, or are you referring to this exact case?

That creates a non-uniform situation which will be surprising to users. Suddenly for will do things the rest don’t. Making it explicit seems least surprising and intrusive until the open questions have been discussed IMHO.

I wonder how much of the need here is related to analysis retained through folding, canonicalization and general. We already have that facility but each pass has to opt-in to what analysis it retains. And so these are serialized analysis that are more sticky.

I’m puzzled why your quoting this sentence, but not the paragraph I wrote right before it, which I think explain how I see things entirely differently from you here:

That does not mean that these attributes don’t have any semantics information, but that they have to be compatible with the operation semantics itself: they can “augment it” as long as they don’t contradict any conservative handling of the operation that wouldn’t know about these “discardable attributes”.

So I’m a bit puzzled here, because you really didn’t answer or address the specific nuance I introduced.

That para is equally contradictory and in the same way.

This still contradicts the fact that the external attributes can’t expect to add new semantics that the infrastructure should have to specially care about – which is also why I believe they were called discardable. I still don’t think you have a single valid reason here as to why external attributes should be dropped as opposed to being preserved.

@jpienaar But in a large number of cases where you are doing in-place mutation (please see upthread), you are already preserving such external attributes because, at all those places, the pass/utility/rewrite doesn’t care (and shouldn’t) about the presence of such external attributes. So dropping them would in fact be inconsistent with all of those and preserving would be consistent. The patch is doing the latter – that’s neither wrong but consistent with certain existing behavior and I agree inconsistent with others. In which direction would you make the argument on uniformity?

We are going in circles. To make things concrete, let me separate the “what” from the “why”. Does everyone agree that it’d be useful to have these two accessors auto-generated on an op:

void getExternalAttributes(SmallVectorImpl<Attribute> &attrs);
void getIntrinsicAttributes(SmallVectorImpl<Attribute> &attrs);

Both can be auto-generated on a derived op OpTy.