Authors: @ftynse, @stellaraccident, @mehdi_amini
Context: [RFC] Starting in-tree development of python bindings
Introduction
Following up on several discussions, we propose to reboot the in-tree development of C APIs to access core MLIR functionality (generic IR creation, inspection and modification, Pass management). The APIs will focus on low-level “generic” components and will not be concerned with dialect-specific operations, types or attributes. They are intended as building blocks for further bindings in different languages, which are expected to use these core components and provide wrappers around them following language-specific idioms. APIs for dialect-specific IR components can be based, e.g., on custom ODS backends and will be subject to separate proposals.
Scope
The initial implementation will expose creation and inspection methods for Attribute, Block, MLIRContext, Operation, Region, and Type, as well as for parsing and printing. Attribute and Type will be created by parsing from strings. Operations will be created using an equivalent of OperationState, i.e. a full list of operands, results, attributes, successors and regions. Further extensions will include creation and inspection of Standard types and attributes, as well as guidance for exposing dialect-specific constructs.
Connection to ODS is out of scope and will be discussed separately.
Type Model
Core types will be exposed as type-punned opaque structures, i.e. empty named structures that are declared in headers, defined in source files (hence the opacity), and only transmitted by pointer. They will actually point to the corresponding C++ object. Since users of the C APIs don’t have access to the type definition, they cannot dereference the pointer and therefore cannot violate the strict aliasing rules.
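To make the opaque-handle idea concrete, here is a minimal sketch in plain C. All names (demoContext, DemoContextImpl, numObjects) are hypothetical and are not part of the proposal, and a small C struct stands in for the C++ object that a real handle would point to.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a public header might contain (hypothetical names) --- */
/* Incomplete struct: callers can hold a pointer to it but cannot
   dereference it, because its members are never made visible. */
struct DemoContextOpaque;
typedef struct DemoContextOpaque *demoContext;

demoContext demoContextCreate(void);
void demoContextDestroy(demoContext ctx);
int demoContextGetNumObjects(demoContext ctx);

/* --- What the implementation file might contain --- */
/* Assumption: a plain struct stands in for the real C++ object. */
struct DemoContextImpl {
  int numObjects;
};

demoContext demoContextCreate(void) {
  struct DemoContextImpl *impl = malloc(sizeof *impl);
  if (impl == NULL)
    return NULL;
  impl->numObjects = 0;
  /* Pun the implementation pointer into the opaque handle type. */
  return (demoContext)impl;
}

int demoContextGetNumObjects(demoContext ctx) {
  /* Cast back to the real type before touching any member. */
  return ((struct DemoContextImpl *)ctx)->numObjects;
}

void demoContextDestroy(demoContext ctx) { free(ctx); }
```

Only the implementation side ever casts the handle back to its real type, so client code has no way to read or write through the pointer.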
Memory and Ownership Model
All IR objects are allocated and deallocated in C++, using appropriate mechanisms. C types come in two flavors, owning and non-owning, identified by their name. Objects of owning C types are produced by creation functions and are expected to be either transferred to another object, which then owns them, or destroyed from C. They cannot be inspected directly, but can be passed to the “getter” function that returns an equivalent, inspectable non-owning object. Calls to any function that accepts an owning type, except the “getter” function, are assumed to transfer ownership of the object to the callee. It is impossible to construct an owning object from a non-owning object.
Example:
// Create two blocks that the caller owns.
mlirOwningBlock ownedBlock1 = mlirBlockCreate();
mlirOwningBlock ownedBlock2 = mlirBlockCreate();
// Get a non-owning view of the block for inspection,
// the caller retains ownership of the block. Cannot
// use ownedBlock1 with mlirBlockSize.
mlirBlock block1 = mlirBlockGet(ownedBlock1);
printf("%d", mlirBlockSize(block1));
// Attach block to a region. The ownership is transferred
// to the callee. Cannot use block1 here.
mlirRegionAppendBlock(region, ownedBlock1);
// Destroy the block we still own. No need to destroy
// ownedBlock1 because we don't own it anymore.
mlirBlockDestroy(ownedBlock2);
Note that C++ classes have implicit ownership semantics, except for constructing an operation with regions that takes unique_ptr<Region>.
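One way the owning/non-owning split could be enforced by the C type system is sketched below. Every name here (the demo prefix, the Impl structs, the fixed block capacity) is an assumption made for illustration, not the proposed API. For brevity the region is a single handle type and allocation failures are not handled.

```c
#include <assert.h>
#include <stdlib.h>

enum { DEMO_MAX_BLOCKS = 8 };

/* Hypothetical stand-ins for the C++ objects behind the handles. */
struct DemoBlockImpl { int numOps; };
struct DemoRegionImpl {
  struct DemoBlockImpl *blocks[DEMO_MAX_BLOCKS];
  int numBlocks;
};

/* Distinct opaque struct types: passing an owning handle where a
   non-owning one is expected (or vice versa) is a compile-time
   diagnostic, even though both wrap the same underlying pointer. */
typedef struct DemoOwningBlockOpaque *demoOwningBlock;
typedef struct DemoBlockOpaque *demoBlock;
typedef struct DemoRegionOpaque *demoRegion;

/* Creation returns an owning handle: transfer it or destroy it. */
demoOwningBlock demoBlockCreate(void) {
  return (demoOwningBlock)calloc(1, sizeof(struct DemoBlockImpl));
}

/* The "getter": a non-owning view; ownership stays with the caller. */
demoBlock demoBlockGet(demoOwningBlock owned) { return (demoBlock)owned; }

/* Inspection functions accept only the non-owning type. */
int demoBlockSize(demoBlock block) {
  return ((struct DemoBlockImpl *)block)->numOps;
}

void demoBlockDestroy(demoOwningBlock owned) { free(owned); }

demoRegion demoRegionCreate(void) {
  return (demoRegion)calloc(1, sizeof(struct DemoRegionImpl));
}

/* Takes ownership of the block. Returns 0 on success; if the region is
   full, destroys the block (so it cannot leak) and returns -1. */
int demoRegionAppendBlock(demoRegion region, demoOwningBlock owned) {
  struct DemoRegionImpl *impl = (struct DemoRegionImpl *)region;
  if (impl->numBlocks == DEMO_MAX_BLOCKS) {
    demoBlockDestroy(owned);
    return -1;
  }
  impl->blocks[impl->numBlocks++] = (struct DemoBlockImpl *)owned;
  return 0;
}

int demoRegionGetNumBlocks(demoRegion region) {
  return ((struct DemoRegionImpl *)region)->numBlocks;
}

/* Destroying the region also destroys every block it owns. */
void demoRegionDestroy(demoRegion region) {
  struct DemoRegionImpl *impl = (struct DemoRegionImpl *)region;
  int i;
  for (i = 0; i < impl->numBlocks; ++i)
    free(impl->blocks[i]);
  free(impl);
}
```

Because no function converts a demoBlock back into a demoOwningBlock, a non-owning view can never be used to claim ownership, which matches the last rule stated above.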
Iterable Objects
Non-owning iterable objects that can be indexed can be accessed through a combination of two functions: “size” and “at(index)”, with names following the C++ names.
Example:
unsigned i, n = mlirOperationGetNumOperands(operation);
for (i = 0; i < n; ++i)
  mlirOperationGetOperand(operation, i);
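The following sketch shows how the “size” and “at(index)” pair might look from both sides of the API. The demo names and the use of plain int operands are assumptions made to keep the example self-contained; real operands would be handles to values.

```c
#include <assert.h>

/* Hypothetical stand-in for the C++ operation: a view over an array. */
struct DemoOperationImpl {
  const int *operands;
  unsigned numOperands;
};
typedef struct DemoOperationOpaque *demoOperation;

/* "size" accessor, named after the C++ getNumOperands. */
unsigned demoOperationGetNumOperands(demoOperation op) {
  return ((struct DemoOperationImpl *)op)->numOperands;
}

/* "at(index)" accessor; the caller must keep index below the size. */
int demoOperationGetOperand(demoOperation op, unsigned index) {
  return ((struct DemoOperationImpl *)op)->operands[index];
}

/* Client-side loop: query the size once, then index. */
int demoSumOperands(demoOperation op) {
  int sum = 0;
  unsigned i, n = demoOperationGetNumOperands(op);
  for (i = 0; i < n; ++i)
    sum += demoOperationGetOperand(op, i);
  return sum;
}
```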
Non-owning iterable objects based on linked lists can be accessed through a combination of “getFirst” and “next” functions, which return NULL to indicate no further objects are available in the list.
Example:
mlirOperation op = mlirBlockGetFirstOperation(block);
for ( ; op; op = mlirOperationGetNextInBlock(op)) {
  // use op.
}
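A minimal sketch of the “getFirst”/“next” pattern, assuming a singly linked list behind hypothetical demo handles (the real operation list and its C++ types are not modeled here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the C++ objects. */
struct DemoOperationImpl {
  int id;
  struct DemoOperationImpl *next;
};
struct DemoBlockImpl {
  struct DemoOperationImpl *first;
};

typedef struct DemoOperationOpaque *demoOperation;
typedef struct DemoBlockOpaque *demoBlock;

/* "getFirst": returns NULL when the block has no operations. */
demoOperation demoBlockGetFirstOperation(demoBlock block) {
  return (demoOperation)((struct DemoBlockImpl *)block)->first;
}

/* "next": returns NULL after the last operation in the block. */
demoOperation demoOperationGetNextInBlock(demoOperation op) {
  return (demoOperation)((struct DemoOperationImpl *)op)->next;
}

/* Client-side traversal: walk until the NULL sentinel. */
unsigned demoBlockCountOperations(demoBlock block) {
  unsigned count = 0;
  demoOperation op = demoBlockGetFirstOperation(block);
  for (; op; op = demoOperationGetNextInBlock(op))
    ++count;
  return count;
}
```

Using NULL as the end marker means an empty list needs no special case: the loop body simply never runs.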
Open questions
Naming scheme (bikeshed!)
The current proposal is to prefix all function and type names with mlir, leading to camelBack being used for both types and functions. Functions that logically correspond to methods on a given object are prefixed with the object’s type and take this object as the first argument, except for creation functions.
Examples:
- Type: mlirOperation
- Owning counterpart: mlirOwningOperation
- Function taking an object of this type as first argument: mlirOperationGetNumOperands
Directory Tree and Build
Current proposal is to put C APIs under {mlir/include,mlir/lib,mlir/test}/mlir-c and compile as a separate target.
Rationale: by placing API implementations in a separate library, we make sure they remain non-intrusive to core C++ APIs (e.g., we don’t expect files in lib/IR to include headers from include/mlir-c) and can reuse common implementation details.
The C API library can be built and tested by default, since the C++ compiler can usually compile C and is already configured in the build.
Existing include/mlir-c
We are unaware of any users of this code, so it will be removed.