[RFC] Verification Order In MLIR

Introduction

We don’t have a clear definition of how a verification is done, things include the ordering of all those verifications, how to handle the dependency of verifier between the trait/interface, .etc. Besides, users don’t know what things are verified before calling their own verifier so that we may end up verifying the same things many times. An even worse case is that it may cause a crash in the verifier.

Background / Motivation

The current operation verification process(OperationVerifier::verifyOperation()) includes steps,

  1. Invariant verification. The order is,
    1. Operation structure verifiers come first, e.g. OneRegion.
    2. Other traits/interface verifiers will be invoked.
    3. Operand/Result verifier
    4. user defined verifier
  2. Region/block verification

We have seen some issues caused by the above verification scheme and those issues can be categorized into two kinds,

Verification dependency between two verifiers

A verifier of an interface may assume another interface/trait or user defined verifier has been verified and would like to access the property of that interface. Here’s an example,

RegionBranchOpInterface

The user’s case is like, it puts the RegionBranchOpInterface::verifyTypes in the interface verifier declaration and has a user defined verifier for operation,

static RegionBranchOpInterfaceTrait::verifyTrait(::mlir::Operation *op) {
  return ::RegionBranchOpInterface::verifyTypes(op);
}
static LogicalResult verify(IfOp op) {
  if (op.getNumResults() != 0 && op.elseRegion().empty())
    return op.emitOpError("must have an else block if defining values");
  return success();
}

They were expecting that the user defined verifier will be run before the interface verifier. But in fact, the interface verifier will be run before the user defined verifier. In this case, it results in a crash if the elseRegion is empty.

Verification dependency with region ops

SymbolTable

LogicalResult detail::verifySymbolTable(Operation *op) {
  // ...
  auto verifySymbolUserFn = [&](Operation *op) -> Optional<WalkResult> {
    if (SymbolUserOpInterface user = dyn_cast<SymbolUserOpInterface>(op))
      return WalkResult(user.verifySymbolUses(symbolTable));
    return WalkResult::advance();
  };

  Optional<WalkResult> result =
      walkSymbolTable(op->getRegions(), verifySymbolUserFn);

In this case, SymbolTable will try to verify the region ops before they get verified. It may lead to problem like crash if we try to manipulate a malformed region operation.

Proposal

I would like to propose an explicit ordering of verifier execution. According to the second case above, we know that the verification of nested operations is required for certain verifiers. Hence, the first thing is to divide the verifiers into two groups,

Running before/after nested operations

In ODS, add a field verifyGroup to specify if this verifier should be run before or after the verifiers of nested operations.

// 0 for run-before, 1 for run-after. This should be explicitly specified if there's a verifier defined.
bit verifyGroup = ?;

For verifiers in the same group, they will follow the ordering listed down below,

Verification order in verifyGroup

  1. Operation structure verifier
  2. trait/Interface verifier
  3. User defined verifier

Operation structure verifier

The traits defined in OpDefinition.h or the operand/result/attribute verifiers generated by tblgen. So far, verifiers in this category will be also in run-before verifyGroup.

Trait/Interface verifier

Most traits are used to specify special properties and which usually need to be verified based on the result of the operation structure verifier. Interface is supposed to provide a generic way of interacting with an operation, thus it’ll also expect a valid operation for it’s manipulation. Like the case mentioned above, the RegionBranchOpInterface::verifyType() expects an operation to have its region number and emptiness of region verified.

User defined verifier

The last verifier to be invoked is the user defined verifier. The user defined verifier is usually supposed to verify the things which can’t be scoped by interfaces or taits. Put it at the last step so that it can do the verification based on most facts.

Add dependency between traits and interface

Trait/interface may depend on the property of another trait/interface and the above ordering is not yet enough to cover all of these cases, for example, if it is the dependency between two interfaces. Here gives two ways of specifying the verification dependency.

The order of interface/trait declarations implies the verification dependency

def Tensor_InsertSliceOp : BaseOpWithOffsetSizesAndStrides<
    Tensor_Dialect, "insert_slice",
    [NoSideEffect,AttrSizedOperandSegments,OffsetSizeAndStrideOpInterface,
     DeclareOpInterfaceMethods<ReifyRankedShapedTypeOpInterface>,
     TypesMatchWith<"expected result type to match dest type",
                    "dest", "result", "$_self">]> {
     ...
}

AttrSizedOperandSegments will be verified before OffsetSizeAndStrideOpInterface because the current verification framework handles them in the declaration order.

Note that this is implicitly to be true now because of the implementation. Explicitly claim that is supported here.

Verification dependency list

Trait/Interface can specify the dependent traits/interfaces verification.

// The list of traits/interfaces which will have their verifier run before you
list<OpTrait> dependencyList;

Some traits/interfaces may specify this in their verifier, for example, it may use static_assert(ConcreteType::template hasTrait<Foo>()) to tell the dependency. It will be encouraged to move this to the dependencyList.

With these two methods, we will have an explicit verifier execution order and be able to find out the dependency violation statically.

Conclusion

This verification specification is supposed to give the user a clear way to organize the op’s verification. Then we can have a more robust and efficient verifier. Note that all other things not specified here will be viewed as undefined behaviors.

PS: I try to focus on the high level concept first so that I skip some implementation details like how we write the dependent list for traits defined in C++.

1 Like

+1. An unclear verification order has lead to many strange and confusing bugs. Inflexibility in the verification ordering has also precluded some features.

In particular, I am trying to implement Attribute subrange lookup in ODS-generated attribute getters, but I cannot do so without introducing potentially confusing bugs due to the current verification ordering – Operation structure verification happens after traits!

Is this field added to the op class? Can an op have both a before- and after- verifier?

I take it that the ODS-generated verifiers will no longer be lumped in to the user-defined one?

Dialects and passes have dependentDialects, so how about calling this one dependentTraits? I like the idea of allowing traits to depend on each other, so that verification does not get duplicated across traits.

How will you ensure that the traits are specified in the correct order? E.g. if trait B depends on trait A, can you ensure that [B, A] results in an error but [A, B] does not?

I think this proposal is a good starting point on fixing/improving the current state of verification, so +1 in general.

+1

I would just insert static asserts that the dependent traits appear first in the trait list.

Can we avoid the magic numbers? Is there a way to use named values for this?

Also the name verifyGroup doesn’t really seem related to the fact that this is indicating if the verifier is run before or after nested operations. Fixing the magic number issue may help here.

I’m not sure if this is mentioned somewhere, but I would expect AttrSizedOperandSegments(and the similar traits) to be verified with the other structural traits; i.e. I wouldn’t expect traits to have explicit dependencies on it.

– River

Thats a good point and feeds also in with Jeff’s question around ODS generated verifiers bit being lumped in with user ones.