RFC: Symbol definition/declaration x visibility checks

Hi all,

When an operation declares or defines a symbol by implementing the Symbol interface, today we do not seem to do any verification to check if the visibility (public/nested/private) is valid w.r.t whether the operation declares vs defines a symbol. This issue came up recently during a review (https://reviews.llvm.org/D90337) and it seemed appropriate to have a separate discussion since this likely affects all operations that introduce new symbols (and hence implement the Symbol interface).

Note that declaration vs definition is expressed in the IR as isExternal(). For example, FuncOp::isExternal(). The operation defines a symbol if !isExternal() else it declares a symbol. In the text below, external symbol means a symbol not defined in the current IR.

If an operation defines a new symbol, then it’s conceivable that the definition can have all 3 visibility attributes. If its private, only operations nested under the parent operation can reference that symbol by name. If its nested, any operation in the current IR can reference it by name. And if its public, an operation external to the current IR can reference it by name as well.

However, when it declares a symbol, it seems not all visibility values make sense. The declaration in a sense is importing the symbol from an external IR into the current IR (we would never need to use a declaration for using a symbol that is defined in the current IR). This declaration being private means that the import makes the symbol accessible only to operations nested in the parent op, and if the declaration is nested, it means that any operation in the current IR can access the symbol through that import. However, if a declaration has public visibility, it seems the operation is importing a external symbol and then intending that operations external to the current IR reference the symbol through this import. But that does not make sense, since if an external operation wanted to reference an external symbol, it will not do that using a declaration in the current IR. So it seems verification should disallow declarations that have public visibility.

If that seems reasonable, the next question is whether this can be done as a part of the Symbol interface itself. The interface today has knowledge of visibility but does not know if a particular operation is a declaration or definition. One option is to extend the interface to include that information (as an isExternal() query) and build this verification in the Symbol interface itself. Other option is to add custom verification for this to all Symbol operations, but folding this into the Symbol interface seems preferable.

I agree here, I don’t have an objection to adding it to the Symbol interface.

+1 from me. I see declarations as a natural extension to the interface. One caveat is that I would prefer isDeclaration naming, instead of isExternal.

I agree with the naming. isExternal can be confusing when its next to isPublic etc in the Symbol interface.

I started looking into this and we have several tests that fail this verification because of function declarations. The way to fix that is to add a attribute dictionary to every declaration in the tests:

attributes {sym_visibility = "private"}

which is quite verbose. So I am proposing in the interest of not redoing this, we change the print/parse of functions to include the visibility attribute after func and then drop it from the attribute dictionary when printing. So instead of:

func @print_memref_f32(memref<*xf32>) attributes {sym_visibility = "private"}

it will be

func "private" @print_memref_f32(memref<*xf32>)

Note that the parsing code can be modified in such a way that it will accept both this new syntax as well as the existing syntax where visibility is in the attribute dict at the end. And with recent changes to verify that attributes cannot appear twice after custom parsing, it should also catch errors if visibility is provided both after func and in the attribute dict.

Thoughts?

Do we need the quote around “private”? There are only three possibilities for the visibility.

FWIW, I acknowledge that these 2 are independent. But if the new FuncOp syntax seems ok, we might implement that first and then use the less verbose syntax to upgrade the tests for the symbol visibility verification changes.

The quote around private was a bring-over from similar syntax for global_memref. There we have the quote because the visibility is generated by the declarative printer, and it always puts quotes around String attributes. FuncOp uses a fully custom parser, so it could do without, but that introduces divergence between the syntax of the two symbol operations in the standard dialect.

I did a prototype implementation of the syntax change, and although its transparent on the parsing side, the printer by default will use the new syntax, so the test still need to be updated to use the new syntax in checks. I’ll post it for review to see what folks think.

Another thing I want to get opinion: If post parsing, a symbol happens to be a declaration and no explicit visibility attribute was specified, would it make sense to default to non-public visibility? Say private? Today, the visibility is public by default.

The (weak) advantage is things can get inferred by default, so existing syntax continues to work. I could see opinion against it as it might be confusing if we also do not print the default visibility (so based on whether its declaration or definition, almost similar looking IR has one symbol as private and other as public):

global_memref @x : memref<f32>                 // inferred private
global_memref @y : memref<f32> = uninitialized // inferred public

Being explicit is probably better here while accepting some redundancy, but wanted to check if there are other thoughts. Explicit syntax will continue to work in any case.

Regardless of if we accept this in the parser, I hope the printer would stay more explicit.

I’d expect both printer and parser to have similar behavior (i.e., if default of private can be inferred without being explicit, then the printer would also not print that default). That’s not the case today because there is just one universal default: public. I’d expect we do not want to be a place where:

func @foo()

is inferred as private but then printed as:

func "private" @foo()

Random thing, but I’d love for there to be a way for the ODS asmprinter/parser to support “barewords” (unquoted identifiers) for enums in unambiguous cases like this.

Since we talk about an IR, I personally would prefer more explicit forms and error reporting to complex inference rules that one has to constantly keep in mind. So if a symbol is declared and it should not be public, I would just report an error of the kind “symbol declarations cannot have public visibility (Note: in absence of visibility attribute, the symbol is considered public)”. We already have such a behavior for implicit terminators in some regions.

Big +1 here. I have a bunch of these in the LLVM dialect. Similarly, cmpi/cmpf should be updated to the non-quoted syntax (we just did not have enough parser support when we added those, and everything else just followed the template).