LLVM Discussion Forums

Study route of MLIR python bindings

Hi all,

I would like to take part in one of the MLIR projects for Google Summer of Code 2020.
After learning the concepts of MLIR and looking through all the open projects, I’m interested in the MLIR Python bindings project.
Could you suggest a study route for it? And what should I prepare before the GSoC 2020 application period?

Thanks!

Hi,

Great!

On the path to get started with this project, I can think of two items:

  • understanding MLIR concepts: the Toy tutorial is likely a good start.
  • understanding how Python bindings work in general, and what the possible options are. LLVM is exposing a C API that is then wrapped in python using ctypes. But there are other possibilities, for example LLDB is using swig. More recent approaches include clif and pybind11.
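
To make the ctypes option concrete, here is a minimal sketch of wrapping a C function from Python. It does not touch LLVM or MLIR: it uses `ctypes.pythonapi`, the handle to the C API of the running CPython interpreter, as a stand-in for any library that exports plain C functions. The wrapper name `c_python_version` is made up for this illustration.

```python
import ctypes
import sys

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
# It stands in here for any shared library that exports plain C functions.
_get_version = ctypes.pythonapi.Py_GetVersion

# The C signature has to be declared by hand: const char *Py_GetVersion(void).
_get_version.restype = ctypes.c_char_p
_get_version.argtypes = []


def c_python_version() -> str:
    """Hypothetical thin wrapper that hides the ctypes details from callers."""
    return _get_version().decode("utf-8")


if __name__ == "__main__":
    print("from the C API:", c_python_version())
    print("from sys      :", sys.version)
```

The point to notice is that every signature is restated manually on the Python side, which is the main maintenance cost of this approach compared with generators such as swig or C++-level tools such as pybind11.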

I’m interested in getting a good comparison of these options before moving forward with one, and I’m sure there are people in the community who have experience with some of these frameworks and have opinions on the pros and cons.

Thanks for your suggestion!
I’m currently following the Toy tutorial to understand MLIR concepts. After that I will study these approaches to Python bindings and work out the differences between them. If I have further questions, I will raise them in the community.

Thanks for your interest!

Please follow what @joker-eph suggested. @nicolasvasilache and I are listed as mentors for the project, don’t hesitate to ping us should you decide to work on this project.

A couple of other pointers: at some point, we explored using pybind11 to have Python APIs closer to the C++ ones, but it was focused on specific parts of MLIR and we did not push it further. The main hurdle for constructing the IR is the templated Op::build APIs, which are non-trivial to replicate (on one hand, we don’t want to have to write new bindings code for every operation; on the other hand, using the “generic” Operation seems too verbose).
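
The trade-off between the generic Operation and op-specific builders can be sketched in plain Python. Everything below is a toy model invented for illustration, not an MLIR API: `Operation`, `generic_constant` and `ConstantOp` are hypothetical names, and types are represented as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    """Toy stand-in for a generic operation; not an MLIR API."""
    name: str
    operands: list = field(default_factory=list)
    result_types: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def generic_constant(value: int) -> Operation:
    # Generic form: the caller spells out the op name, the attribute key
    # and the result types every single time.
    return Operation(
        name="std.constant",
        operands=[],
        result_types=["i32"],
        attributes={"value": value},
    )


class ConstantOp(Operation):
    """Op-specific wrapper: concise to use, but one class is needed per op."""

    def __init__(self, value: int, type_: str = "i32"):
        super().__init__("std.constant", [], [type_], {"value": value})


if __name__ == "__main__":
    print(generic_constant(42))
    print(ConstantOp(42).attributes)
```

Both calls describe the same operation. The second is nicer to write, but someone has to provide the `ConstantOp` class, and an equivalent for every other operation in every dialect.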

Also, I’ve stumbled upon https://github.com/spcl/pymlir, which does not seem to be bindings but an MLIR parser implemented in Python.

Thanks for your response!

I will do my best to get familiar with pybind11 and pymlir. If I have questions or ideas, I will discuss them in the community, and once I’m ready for the project, I will contact you.

Hi @ftynse,

After a quick look at the Toy tutorial and pybind11, I think I understand the main hurdle for constructing the IR. As I see it, a dialect defines many operations, and each operation implements its own templated Op::build APIs. In the Python bindings file, we would then have to write a binding for each Op::build API inside PYBIND11_MODULE, which duplicates work.

Is my understanding correct? If so, how could I learn more about it? And I can’t find MLIR python bindings examples in the llvm-project, could I have a demo to try out the python bindings?

Thanks!

Yes, your evaluation goes in the right direction. Op::build methods are different for all operations and, furthermore, they are called indirectly through the OpBuilder::create function template. Users are not expected to call the *Op::build APIs themselves. In C++, we rely on templates to forward arguments from OpBuilder::create to the relevant Op::build function, but it is unclear how this can be achieved in Python.
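
For readers less familiar with the C++ side, the forwarding pattern itself can be mimicked in a pure-Python toy model using `*args` and `**kwargs`. This is only a sketch of the shape of the pattern: the classes below are hypothetical and defined entirely in Python, so they sidestep the real difficulty, which is that the actual build methods live in C++ and each has its own signature that would still have to be exposed.

```python
class OperationState:
    """Toy container that a build method fills in; not an MLIR API."""

    def __init__(self, name):
        self.name = name
        self.operands = []
        self.attributes = {}


class ConstantOp:
    name = "toy.constant"

    @staticmethod
    def build(state, value):
        state.attributes["value"] = value


class AddOp:
    name = "toy.add"

    @staticmethod
    def build(state, lhs, rhs):
        state.operands.extend([lhs, rhs])


class OpBuilder:
    """Hypothetical builder with a single entry point for every op."""

    def __init__(self):
        self.ops = []

    def create(self, op_cls, *args, **kwargs):
        # Forward whatever the caller passed to the op-specific build,
        # loosely mirroring how the C++ template forwards its arguments.
        state = OperationState(op_cls.name)
        op_cls.build(state, *args, **kwargs)
        self.ops.append(state)
        return state


if __name__ == "__main__":
    builder = OpBuilder()
    one = builder.create(ConstantOp, 1.0)
    total = builder.create(AddOp, one, rhs=one)
    print([op.name for op in builder.ops])
```

Note that a mismatch between the arguments and the op's `build` signature only surfaces at call time as a `TypeError`, whereas the C++ template rejects it at compile time.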

Also, many of the Ops are generated from ODS (https://mlir.llvm.org/docs/OpDefinitions/), which we could try and use for generating Python bindings as well.
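
As a rough illustration of what generating bindings from op definitions could look like, the sketch below emits Python wrapper classes from a small table of op descriptions. The description format (`OP_SPECS`) and the helper names are invented for this example; real ODS definitions are TableGen records and carry far more information (types, attributes, traits, custom builders).

```python
# Hypothetical, heavily simplified op descriptions; real ODS uses TableGen.
OP_SPECS = [
    {"class_name": "AddOp", "op_name": "toy.add", "operands": ["lhs", "rhs"]},
    {"class_name": "TransposeOp", "op_name": "toy.transpose", "operands": ["input"]},
]


def emit_python_class(spec: dict) -> str:
    """Return Python source for one wrapper class described by `spec`."""
    params = ", ".join(spec["operands"])
    lines = [
        f"class {spec['class_name']}:",
        f"    name = {spec['op_name']!r}",
        "",
        f"    def __init__(self, {params}):",
        f"        self.operands = [{params}]",
    ]
    return "\n".join(lines) + "\n"


def load_generated_ops(specs) -> dict:
    """Generate source for every spec and execute it into a fresh namespace."""
    namespace = {}
    for spec in specs:
        exec(emit_python_class(spec), namespace)
    return namespace


if __name__ == "__main__":
    print(emit_python_class(OP_SPECS[0]))
    ops = load_generated_ops(OP_SPECS)
    print(ops["AddOp"]("a", "b").operands)
```

In a real setup the generated source would more likely be written to a file at build time than executed on the fly; `exec` is used here only to keep the sketch self-contained.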

“And I can’t find MLIR python bindings examples in the llvm-project, could I have a demo to try out the python bindings?”

We only explored it; there is no publicly available code for the bindings, hence the open project. :wink:

Please also consider looking at different ways of exposing the bindings, as @joker-eph mentioned:

“LLVM is exposing a C API that is then wrapped in python using ctypes. But there are other possibilities, for example LLDB is using swig. More recent approaches include clif and pybind11.”

We experimented with pybind11 because that was the one we knew best.

I’m excited to be on the right track! I also found that the ODS framework defines dialect operations using TableGen, and I’m considering how to emit the Python bindings automatically, in a way that is equivalent to OpBuilder::create and mlirGen().

I will keep studying the different binding approaches to explore how to solve the problem. Could I apply to GSoC 2020 with this project?

As Mehdi mentioned before, we would like to have a rationale for choosing which bindings library to use, and for whether or not to generate the op-level bindings. (Personally, I would consider starting with the generic IR concepts such as Type and Operation before tackling Op-specific constructors.) There are further potential issues with auto-generating bindings that I’ll let you discover by looking at how ODS works.

It is a bit too early to talk about GSoC. We will have to wait for Google to approve the participation of the LLVM organization in GSoC and allocate (or not) the slots to the organization. Then we will need to decide, within LLVM, how many of the slots are available for MLIR-related projects and, ultimately, which projects and candidates to accept. If the LLVM organization is accepted, the student application period will open on March 16. You can find a more detailed timeline here: https://summerofcode.withgoogle.com/how-it-works/. Personally, I would prefer to endorse an applicant who has already contributed to the project before the application is due.