@stellaraccident’s comment here made me think we should have a discussion about how we handle dialects in the C API.
Currently, we have a loosely associated set of functions for registration, loading and accessing the namespace of a dialect, and we only provide this for the Standard dialect. There is a catch-all mlirRegisterAllDialects function which registers everything, but if you only want a subset of dialects (for instance just “std” and “scf”) you have to create a new CAPI library (either for the individual dialects or for the bundle of dialects you desire).
I think the best path forward is to use static constructors to create a global namespace-to-dialect mapping which can be utilized by the C API to provide MlirLogicalResult mlirContextRegisterDialect(MlirContext, MlirStringRef namespace). I realize static constructors are a little evil, so I recommend we make this global mapping optional using a compile-time flag like MLIR_STATIC_DIALECT_MAPPING_ENABLED. If enabled, MLIR should crash on startup if two dialects exist with the same namespace.
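A minimal sketch of the idea in this post, using stand-in types rather than the real MLIR ones. `FakeContext`, `StaticDialectRegistration`, `dialectMap`, and `contextRegisterDialect` are hypothetical names chosen for illustration; only the flag name comes from the post, and it is defined inline here although a build system would normally pass it.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <string>

// Normally supplied by the build system; defined here so the sketch compiles.
#define MLIR_STATIC_DIALECT_MAPPING_ENABLED 1

// Stand-in for MlirContext (assumption: the real type is not used here).
struct FakeContext {
  std::map<std::string, bool> registered;
};

using RegisterFn = void (*)(FakeContext &);

// A function-local static sidesteps static-initialization-order problems:
// the map is constructed on first use, even before main() runs.
static std::map<std::string, RegisterFn> &dialectMap() {
  static std::map<std::string, RegisterFn> map;
  return map;
}

// Each static instance of this type adds one entry before main() runs.
struct StaticDialectRegistration {
  StaticDialectRegistration(const char *ns, RegisterFn fn) {
    if (!dialectMap().emplace(ns, fn).second) {
      // Two dialects share a namespace: crash on startup, as proposed.
      std::fprintf(stderr, "duplicate dialect namespace: %s\n", ns);
      std::abort();
    }
  }
};

static void registerStd(FakeContext &ctx) { ctx.registered["std"] = true; }
static void registerScf(FakeContext &ctx) { ctx.registered["scf"] = true; }

#if MLIR_STATIC_DIALECT_MAPPING_ENABLED
static StaticDialectRegistration stdRegistration("std", registerStd);
static StaticDialectRegistration scfRegistration("scf", registerScf);
#endif

// Hypothetical analogue of mlirContextRegisterDialect(context, namespace).
// Returns true on success and false when the namespace is unknown.
bool contextRegisterDialect(FakeContext &ctx, const std::string &ns) {
  auto it = dialectMap().find(ns);
  if (it == dialectMap().end())
    return false;
  it->second(ctx);
  return true;
}
```

Looking up "std" registers only that dialect in the given context, and an unknown namespace is reported as a failure instead of crashing.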
I am quite strongly opposed to bringing back global-constructor-based registration now that we have eliminated all of it from MLIR. I would rather push this to the client if they want it, and never provide or rely on this kind of thing in upstream APIs.
As an aside, I think there is a middle way here, which I was hinting at in the revision. There is something potentially nice about having the dialect registration behind a C API (i.e. the act of using becomes the act of linking, and you don’t have binaries that just grow without bound). But I think that if we want to make dialects more pluggable, we need a better API: in the limit, the right API would let someone dlopen a shared library, scan for the dialect registration hooks and then load/initialize them generically. Such a mechanism also allows for static bundling at the right level just by listing externs in an accessible place.
I’m not at my computer right now but can try to sketch something out later.
“I am quite strongly opposed to bringing global constructor based registration now that we eliminated it all from MLIR”
I’m sorry but I don’t have the historical context here, can you please elaborate on what the downsides would be? From my experience, the downsides to static constructors are: 1) things run single-threaded before main, so they can slow down launch and 2) the order static constructors are invoked in is sometimes surprising and hard to reason about. I think an off-by-default semantic and crashing on reused namespaces addresses these issues.
@stellaraccident I haven’t done much binary reflection like you’re suggesting, so I’m interested in what you come up with! Would this be implementable directly in the C API?
Mehdi (and others) spent a significant amount of time scalpeling out some rather prolific uses of static constructors for the various registrations in MLIR. Static constructor/registration systems are really convenient to get started with, but they tend to create monsters in a few additional ways:
They make it very hard to reason about what should be in the various binaries, since a lot of things end up along for the ride.
They often degrade over time into needing a second registration-like system to opt things out in certain situations.
They work against how the build system wants to link things (i.e. they require machinery for whole-archive linking, which varies by platform).
So, I think we are saying “static constructor” here but actually referring to a static-constructor based registration system.
My personal opinion is that core APIs should be factored so as not to introduce static registration systems and then, when absolutely necessary, something further up may introduce one for its own needs. We’ve been kicking the dialect registration can down the road to see where that point may be.
Speaking for the Python API, since we are solidly in dynamic linking territory there anyway, I would far rather see a convention for building/naming dialects and conversion libraries as shared-objects/DLLs and then using ctypes to dynamically load them given some kind of search path at runtime. For that to work without scaling to require a python extension module per dialect/conversion, you need some level of simplicity/opaqueness to the registration APIs that can be introspected and handled at runtime in a very coarse manner.
As an example (essentially inverted from the example present now for the std dialect):
If the context exposed such an interface, parameterized via an opaque struct (i.e. we would hide the struct details in the impl so that user code just gets a strongly typed void* equivalent), then this nicely separates the problem of how I get a hold of the dialect registration vs what I do with it: if I have some generic way to get the registration as a void*, everything else falls out. In Python, we would likely expose this as a Capsule wrapping the void*.
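A rough sketch of what such an opaque-struct interface could look like. The original snippet is not reproduced here, so everything below is an assumption: `Context`, `MlirDialectRegistrationHook`, and `contextRegisterDialect` are hypothetical stand-ins, and only the `mlirDialectRegistrationHookGet_std` accessor name appears elsewhere in this thread.

```cpp
#include <cassert>
#include <set>
#include <string>

// Stand-in for the real context type (assumption).
struct Context {
  std::set<std::string> dialects;
};

// Public header side: callers see only an incomplete type, so the pointer
// behaves like a strongly typed void*.
struct MlirDialectRegistrationHook;

// Implementation side: the struct layout stays private to the library.
struct MlirDialectRegistrationHook {
  const char *dialectNamespace;
  void (*registerFn)(Context &);
};

static void registerStdImpl(Context &ctx) { ctx.dialects.insert("std"); }

// How the hook is obtained: a no-argument accessor per dialect.
extern "C" const MlirDialectRegistrationHook *
mlirDialectRegistrationHookGet_std() {
  static const MlirDialectRegistrationHook hook = {"std", registerStdImpl};
  return &hook;
}

// What is done with the hook: one generic entry point on the context,
// independent of which dialect the hook belongs to.
void contextRegisterDialect(Context &ctx,
                            const MlirDialectRegistrationHook *hook) {
  hook->registerFn(ctx);
}
```

The point of the split is that the second function never changes as dialects are added; only new accessors appear.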
Now, as an approach for discovering these in the dynamic linking case, a convention would help: let’s say that for a dialect with namespace std, the accessor for its registration hook is a no-argument C function named mlirDialectRegistrationHookGet_std.
Now, in the dynamic linking case, I have to write no more native code to grab such registration hook accessors from any shared library. In Python, if I load the shared library, I just scan through the attributes and find symbols that match the above, calling each and stashing them in a process-global dict[str, Capsule]. And now purely in the Python bindings, I can have a context.registerAllDialects that does what I want and has no linkage level static registration: it would all just be a handful of lines of code.
Taking this further, we may want to consider that we bundle a certain number of dialects in the “core” shared library (i.e. MlirPublicAPI.so today). For this to work, I just need to make sure that it exports each of the corresponding mlirDialectRegistrationHookGet_* methods for the bundled dialects. Then as part of initialization of the mlir python module, I scan MlirPublicAPI.so for registration hooks, just like any other shared library (there may be some additional Python logic to make sure to find it in the proper installed location).
For things that need to operate with purely static linkage, they can use the same hooks but will need to bring their own facility for enumerating them (i.e. they could just declare them in a static array or they could have their own static registration thingy). I don’t live in that world and don’t have a specific approach in mind but I do believe it would layer cleanly on top of what I propose above.
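For the purely static case, one possible shape of the "declare them in a static array" option is sketched below. The `DialectHook` struct, the `kBundledHooks` table, and `findBundledHook` are hypothetical; the accessor bodies are defined inline only so the sketch is self-contained, whereas a real client would just declare them as externs.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical hook payload; a real one would carry registration callbacks.
struct DialectHook {
  const char *dialectNamespace;
};

extern "C" {
typedef const void *(*HookAccessor)(void);

// In a real build these would be extern declarations resolved by the linker.
const void *mlirDialectRegistrationHookGet_std(void) {
  static const DialectHook hook = {"std"};
  return &hook;
}
const void *mlirDialectRegistrationHookGet_scf(void) {
  static const DialectHook hook = {"scf"};
  return &hook;
}
}

// The client's own enumeration: an explicit list, with no global constructors.
// Only the dialects named here get linked into the binary.
static const HookAccessor kBundledHooks[] = {
    mlirDialectRegistrationHookGet_std,
    mlirDialectRegistrationHookGet_scf,
};

// Returns the hook for the given namespace, or nullptr if it is not bundled.
const void *findBundledHook(const char *ns) {
  for (HookAccessor get : kBundledHooks) {
    const auto *hook = static_cast<const DialectHook *>(get());
    if (std::strcmp(hook->dialectNamespace, ns) == 0)
      return hook;
  }
  return nullptr;
}
```

Because the table references each accessor by name, the linker pulls in exactly those dialects and no whole-archive tricks are needed.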
Thinking about it a bit more, if the C++ library defining the dialect defined the mlirDialectRegistrationHookGet_* symbol, that would work for me because I can just have a header file in my bindings that lists all of the symbols for dialects I’m interested in. I would love to avoid creating a new C library per dialect.
Right, this is one way of pushing the registration logic down to the client.
That said from the client point of view I don’t quite get the fundamental difference between:
auto *hook = mlirDialectRegistrationHookGet_std();
auto *hook = mlirDialectRegister_std(context);
It seems to me that in both cases you are relying on scanning a symbol table to find a symbol with a known prefix.
Global constructors are fundamentally “anti-library” IMO: they are process-wide and prevent modularity, leading to a monolithic solution. They also either require build system tricks (force-linking them into the binary) or rely on somewhat fragile or awkward mechanisms (at which point there is no advantage left over explicit registration).
Ultimately it is just a “convenience” that “hides” something from the client, which makes it look more ergonomic at first (the “hello world” is leaner) but has actual adverse effects (anything beyond “hello world” will have to be “fighting against the system” or may even be unable to break down the monolith).
There is no functional difference - just a practical one: since such binary symbol scanning and FFI-based invocation tends to be a fragile point in the system, I generally prefer the slimmest possible API “neck” at that point. This is partly because it makes it too simple to fail and partly because the publisher (MLIR) has more control to fortify the situation (i.e. it could embed a version nonce in the struct or do something else to guard against bad dynamic-dispatch – at least to the level of aborting nicely instead of formatting your hard drive).
Having the dynamically resolved function take no arguments and only return a void* (or equivalently treatable pointer) is the essence of simplicity and puts the full onus on the part of the API where we aren’t bit-banging the function calls.
Either variant is fine, though, and gets it to where I’d like to go. I think that we should move towards this (vs the current setup in StandardDialect.h).
To George’s point, I’d be tempted to emit this C-symbol as part of the normal tablegen-generation for a dialect in the .inc.cc file (which today would mean that they would go in libMLIR.so if it is being built, and tomorrow, they may go into a dialect-specific shared library).
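The version-nonce idea mentioned in this post could look roughly like the following. This is only a sketch under assumptions: the struct layout, the `kHookMagic` and `kHookVersion` values, and `checkedHook` are all invented for illustration, and the check is best-effort since it still has to read from the pointer it was handed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants baked into the library at build time.
constexpr std::uint32_t kHookMagic = 0x4D4C4952; // ASCII bytes "MLIR"
constexpr std::uint32_t kHookVersion = 1;

// Hypothetical hook layout with the guard fields placed first.
struct DialectHook {
  std::uint32_t magic;
  std::uint32_t version;
  const char *dialectNamespace;
};

// The dynamically resolved function takes no arguments and returns a void*.
extern "C" const void *mlirDialectRegistrationHookGet_std() {
  static const DialectHook hook = {kHookMagic, kHookVersion, "std"};
  return &hook;
}

// Validates a pointer obtained through symbol lookup before trusting it.
// Returns nullptr on mismatch so the caller can abort with a clear message.
const DialectHook *checkedHook(const void *raw) {
  if (raw == nullptr)
    return nullptr;
  const auto *hook = static_cast<const DialectHook *>(raw);
  if (hook->magic != kHookMagic || hook->version != kHookVersion)
    return nullptr;
  return hook;
}
```

A hook built against a different version of the struct fails the check instead of being called through a mismatched layout.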
This all makes sense to me. From my perspective, the steps forward are:
Define a C++ class DialectMetadata which stores and provides access to closures for the basic dialect operations (register, load, and getNamespace). (I don’t love the term “metadata” but it is the least offensive term I could think of.)
Make the tablegen emit a C function void *mlirDialectMetadataGet_<namespace> which returns a pointer to a static instance of DialectMetadata for each dialect.
Create an MlirDialectMetadata type in CAPI which is structurally equivalent to a pointer to a DialectMetadata.
Create C functions which, given an MlirDialectMetadata, perform the basic functions of register, load and getNamespace.
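The four steps above can be sketched as follows, with a stand-in `Context` in place of the real MLIR context. The wrapper function names (`mlirDialectMetadataRegister`, `mlirDialectMetadataLoad`, `mlirDialectMetadataGetNamespace`) and the argument order are assumptions, and the accessor is written by hand here rather than emitted by tablegen.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Stand-in for the real context (assumption).
struct Context {
  std::set<std::string> registered;
  std::set<std::string> loaded;
};

// Step 1: closures for the basic dialect operations.
class DialectMetadata {
public:
  using Action = std::function<void(Context &)>;
  DialectMetadata(std::string ns, Action registerFn, Action loadFn)
      : ns_(std::move(ns)), register_(std::move(registerFn)),
        load_(std::move(loadFn)) {}
  void registerDialect(Context &ctx) const { register_(ctx); }
  void loadDialect(Context &ctx) const { load_(ctx); }
  const std::string &getNamespace() const { return ns_; }

private:
  std::string ns_;
  Action register_;
  Action load_;
};

// Step 2: one accessor per dialect returning a static instance.
extern "C" void *mlirDialectMetadataGet_std() {
  static DialectMetadata metadata(
      "std", [](Context &c) { c.registered.insert("std"); },
      [](Context &c) { c.loaded.insert("std"); });
  return &metadata;
}

// Step 3: C-side handles that are structurally just pointers.
struct MlirDialectMetadata { void *ptr; };
struct MlirContext { void *ptr; };

static const DialectMetadata *unwrap(MlirDialectMetadata m) {
  return static_cast<const DialectMetadata *>(m.ptr);
}
static Context &unwrap(MlirContext c) { return *static_cast<Context *>(c.ptr); }

// Step 4: generic C functions that work for any dialect's metadata.
extern "C" void mlirDialectMetadataRegister(MlirDialectMetadata m, MlirContext c) {
  unwrap(m)->registerDialect(unwrap(c));
}
extern "C" void mlirDialectMetadataLoad(MlirDialectMetadata m, MlirContext c) {
  unwrap(m)->loadDialect(unwrap(c));
}
extern "C" const char *mlirDialectMetadataGetNamespace(MlirDialectMetadata m) {
  return unwrap(m)->getNamespace().c_str();
}
```

A binding would wrap the result of the accessor in an `MlirDialectMetadata` and pass it to the generic functions, without needing any per-dialect C code beyond the accessor itself.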
I’ll probably need some assistance with the TableGen bit, but I could take a stab at this this week.
Maybe DialectRegistrationParams? It should probably also include the TypeID, as a void* (see TypeID::getAsOpaquePointer()), for the dialect being registered. I’ll probably also add a type id for MlirContext and use this as a check that shared libraries actually linked in the way expected (vs winding up in separate namespaces). Both would help detect subtle linkage issues that arise.
Shouldn’t be too bad and one of us can help. May also want to let this sit for the evening for others to read/comment.
I’m admittedly bikeshedding here, but the reason I was unsatisfied with DialectRegistrationFoo is that things like getNamespace don’t really have anything to do with metadata. That said, I don’t particularly care (I think in the Swift bindings the “metadata” bit will become Dialect and the “loaded” version won’t be bridged and I will grow to regret this decision over time).
I think we should start with maintaining the current set of functionality (register, load, getNamespace) and new functionality can be added in a subsequent PR.
I’ve started poking around the tablegen, and DialectGen.cpp seems relevant to my quest. Unfortunately, this seems to generate a header file, which doesn’t help much since we need the function to be implemented in a source file (otherwise the symbol won’t be visible or will be implemented multiple times). As far as I can tell, there is no automatically-generated implementation for dialects, is this correct?
Also, even if we did automatically generate an implementation .inc, I’m not sure how we can guarantee that dialect authors will remember to include that file. Admittedly, I haven’t touched C++ in several years so I may be missing some bit of magic we can do. I’m beginning to think what we could do is provide something simpler, like a macro in Dialect.h along the lines of #define DIALECT_C_API_REGISTRATION_HOOK(ClassType), and rely on dialect authors to add that to their implementations.
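One way such a macro might be written is sketched below. It deviates from the single-parameter form in the post by taking the namespace as an extra token so it can be pasted into the symbol name; that second parameter, the `DialectMetadata` stand-in, and the toy `StandardOpsDialect` class are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical payload; a real one would hold register/load closures.
struct DialectMetadata {
  const char *dialectNamespace;
};

// Expands to an exported, unmangled accessor for one dialect.
// Must appear exactly once, in a source file, so the symbol is defined once.
#define DIALECT_C_API_REGISTRATION_HOOK(Namespace, ClassType)                 \
  extern "C" void *mlirDialectMetadataGet_##Namespace() {                     \
    static DialectMetadata metadata = {ClassType::getDialectNamespace()};     \
    return &metadata;                                                         \
  }

// Toy stand-in for a dialect class exposing its namespace statically.
struct StandardOpsDialect {
  static const char *getDialectNamespace() { return "std"; }
};

// What a dialect author would add to the dialect's implementation file.
DIALECT_C_API_REGISTRATION_HOOK(std, StandardOpsDialect)
```

The macro keeps the per-dialect cost to one line, but it still depends on authors remembering to add it, which is the concern raised above.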
Unfortunately, not that I know of. I’ve been unable to take a pass at this for the past few weeks and have resorted to making one-off C bindings for the dialects I’m interested in. It isn’t the most elegant solution but it works for now.
One thing I’ve noticed is that since there isn’t a generic mechanism for types and attributes, most bindings will require some manual C bridging anyway (unless you are ok with just using “parse”). In these situations bridging the dialect itself doesn’t actually add much overhead.
Yeah, I’ve been wondering that myself while working on the npcomp dialects (which have custom registration and typically some accessors for types/attributes, as you say).
If that’s the case, then this primarily becomes a code organization mechanism within the core dialects. It would be nice to have a pattern, though, that downstreams could follow, versus just being completely ad-hoc. I ultimately want to be able to generate standalone shared libraries for each dialect, and that is going to be a mess without some standard way of doing it.
I think that each C-exported dialect really needs to grow its own capi header and cc file, and that we can have a tablegen backend for the boilerplate. This would also be a handy place to define the transformations that are implemented by the dialect (I think).