[RFC] Unified CMake builds (for Python at least)

We had some previous discussion about “unified builds”, where LLVM and CIRCT are configured in one CMake invocation and compiled together: Improve the Circt build system · Issue #208 · llvm/circt · GitHub. Following that, we have supported a unified build approach using LLVM_EXTERNAL_PROJECTS. There are numerous benefits to this approach, as mentioned in the issue, and I have personally enjoyed them.

To take advantage of the major improvements to the Python bindings being introduced in D106520 (Re-engineer MLIR python build support), we will be required to use unified builds, at least when building the CIRCT Python bindings. There was more discussion about this in Discord, for background.

I want to post here for visibility before just making a PR.

For people who are not building the Python bindings, I’m hoping to set this up so we can continue building CIRCT against a pre-built LLVM. Notably, that is how our CI is building CIRCT on each PR/commit to main, and I don’t want to change that (yet).

For people who are building the Python bindings, there are two approaches. The first is to use LLVM_EXTERNAL_PROJECTS, as mentioned in the original issue. This is how I’ve been building CIRCT, and it works with the upstream improvements. The other option is to continue building the same way, but have our CMakeLists.txt include LLVM as a source-level dependency, rather than as a library through find_package. This is how NPComp is being updated. The source-level approach can be enabled only when Python bindings are requested, so those who don’t care can still depend on LLVM as a library.
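For reference, a unified configure with LLVM_EXTERNAL_PROJECTS looks roughly like this (a sketch only: the paths are illustrative, and the Python-related flag is an assumption pending the PR and docs):

    cmake -G Ninja /path/to/llvm-project/llvm \
      -DLLVM_ENABLE_PROJECTS=mlir \
      -DLLVM_TARGETS_TO_BUILD=host \
      -DLLVM_EXTERNAL_PROJECTS=circt \
      -DLLVM_EXTERNAL_CIRCT_SOURCE_DIR=/path/to/circt \
      -DCIRCT_BINDINGS_PYTHON_ENABLED=ON

A single ninja invocation in the resulting build directory then covers both the LLVM/MLIR and CIRCT targets.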

Either way, a unified build means we can no longer cache a pre-built LLVM for CI or local development. I have been using ccache for local development, and it was suggested for CI as well. There is a GitHub action for it, which is already being used in another MLIR-based project. Since the Python bindings are only tested nightly, my plan is to first update the nightly tests to work with one of the two options above, and then follow up with ccache to hopefully speed them up.
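For local builds, hooking ccache in is just the standard CMake launcher variables added to whatever configure invocation you use (a sketch; the GitHub action handles the CI caching side separately):

    cmake -G Ninja ... \
      -DCMAKE_C_COMPILER_LAUNCHER=ccache \
      -DCMAKE_CXX_COMPILER_LAUNCHER=ccache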

Some concrete questions:

  • Is switching to a unified build a show-stopper for anyone using Python? I’d hope not, since this is how CIRCT will be built if it ever graduates from incubator into LLVM proper. ccache really helps for local development in this setup, if you aren’t using it already.
  • For people building Python bindings, is the LLVM_EXTERNAL_PROJECTS approach preferred, or is there a desire to have the other approach work, like NPComp does? That will take a little more CMake tweaking, but it shouldn’t be a problem. Both can work, but I’d like to pick one to document, test in the nightly CI, etc. My preference is to use LLVM_EXTERNAL_PROJECTS: we already have two different ways to build CIRCT in CMakeLists.txt, and if we take the other route, we will have three.
  • For people not building Python bindings, is there any desire to move towards a unified build for CIRCT? It’s been supported for a while, but it’s not what we document in the README, etc. I don’t want to force this yet, but I’m curious if there is any desire from the community to move this direction eventually. Perhaps after the Python setup is well tested, and we have a chance to test the CCache GitHub action, we could consider moving the documentation, CI, etc. to this approach.

Thanks, Mike. We’re switching npcomp to support only this today (the mechanism we’re using is slightly different due to more complicated project requirements, but it’s the same approach), as soon as I get the CI updated and simplified. We get fewer instructions, fewer things that need to line up, and much smaller builds overall.

AFAICT, ccache for the CI is the way to go.


I’ve pushed a draft PR here: [Python] Re-work Python bindings using upstream improvements. by mikeurbach · Pull Request #1484 · llvm/circt · GitHub. In its current state, it requires LLVM_EXTERNAL_PROJECTS when CIRCT_BINDINGS_PYTHON_ENABLED is set, since that was the simplest way to get a unified build working. I’ve updated the docs and integration tests with that approach, and I’m now manually running the integration tests. I included the ccache addition, which seemed simple enough.

I don’t much care either way, but I’m very concerned about compile times. If one is noticeably slower for CIRCT-only changes, then I’m against it.

Regarding compile times:

On the PR, @jdd asked

What are you observing in terms of compile times? I’ve been avoiding unified builds (both in the integration tests and locally) since I assumed they would be slower. Presumably, I’ll need to switch over locally as well.

and above

If one is noticeably slower for CIRCT-only changes, then I’m against it.

In short, I think a unified build is actually going to be the best case for compile times. This was addressed in the original issue by @compnerd:

For the concern of build times - the build can limit to exactly what you need to build rather than building all of LLVM if you are using a unified build. Because the project is built as a single entity, the build system is able to see a complete graph and only build the targets that are used.

When I first read that, I was dubious about how much of an improvement it would be. But I made the switch, and I’ve been extremely happy I did. My productivity went up significantly.

Let me give some scenarios in my day-to-day with CIRCT and LLVM/MLIR:

If you’re just changing CIRCT, CMake/Ninja aren’t going to recompile LLVM at all. This is without ccache or anything like that. The build tool knows you only touched the CIRCT parts and recompiles just those, which is pretty much exactly as fast as a standalone build that uses LLVM as a library.

If you’re working on a change that relies on something upstream, say in IRCore.cpp, you will recompile IRCore.cpp and exactly the targets that depend on it, which may take some seconds on a decent machine. That’s much nicer than the previous workflow I used, where I would fully recompile LLVM/MLIR, and have to sit there watching tons of targets I don’t care about being built and linked. This was the biggest productivity improvement from switching to unified builds for me.
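To make that concrete, a typical edit-rebuild loop in a unified tree looks something like the following (the file path and target name are examples of what I use, not prescriptive):

    # Tweak an upstream file, e.g. the MLIR Python bindings core.
    touch llvm-project/mlir/lib/Bindings/Python/IRCore.cpp
    # Ninja recompiles that file plus exactly its dependents, then runs the CIRCT tests.
    ninja check-circt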

If you’re working with different versions of LLVM, ccache starts to come into play. Say you’re iterating on some revision of upstream LLVM/MLIR that is roughly near main, but while waiting for review, you need to test an unrelated patch to CIRCT. If there are breaking changes to MLIR between the commit CIRCT has pinned and the commit you were basing your upstream patch on (pretty common in my experience), you’re going to need to reset LLVM back to the commit CIRCT has pinned in order to test your patch locally. This is where ccache shines. If you’ve configured it with enough space, and you are using the same CMake configuration, it should take seconds to minutes to recompile LLVM when switching back and forth.

In the scenarios where you fully recompile LLVM, for example when pulling down the latest main, you are still going to see a better compile time, because a “full recompile” means exactly the targets you need (and that have changed), not every target that is enabled by default.

So in summary, I’ve seen nothing but improvements after switching to unified builds last November. Because I sometimes juggle different LLVM versions at once, I install ccache and give it 20G of space.
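For reference, that setup is just (the size is what works for me, not a recommendation):

    ccache --max-size=20G   # or: ccache -M 20G
    ccache -s               # check cache size and hit rate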

I have been using the LLVM_EXTERNAL_PROJECTS mechanism, because I also build several other MLIR-based CMake projects, and I enjoy being able to do a unified build for all of them. We could support something like what NPComp is now doing, but I don’t foresee a noticeable difference. From the build tool’s perspective, we are still going to compile the same targets either way. And CIRCT already supports LLVM_EXTERNAL_PROJECTS, since apparently others care about this mechanism as well.

Yeah, ditto to what Mike says. For me, the cost and disk thrash of installing all of the (unneeded but built) LLVM binaries that come along for the ride were an extreme penalty, both in terms of end-to-end workflow time and the ability to edit anything and get a speedy incremental build.

On the npcomp and IREE side, we include the LLVM project as a sub-project (as opposed to through LLVM_EXTERNAL_PROJECTS). As Mike says, from the build graph perspective it is the same, but that mode is more suitable for standalone things (i.e., ninja builds just your project, not the union of all projects). But it is easy to get that behavior from either mode.
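For anyone curious, the sub-project mode is roughly an add_subdirectory of the LLVM tree from the outer CMakeLists.txt. A minimal sketch, with illustrative paths and options (the actual npcomp/IREE setup has more moving parts):

    # Outer project's CMakeLists.txt
    set(LLVM_ENABLE_PROJECTS "mlir" CACHE STRING "" FORCE)
    set(LLVM_TARGETS_TO_BUILD "host" CACHE STRING "" FORCE)
    # EXCLUDE_FROM_ALL keeps ninja's default target scoped to this project;
    # LLVM/MLIR targets still build on demand when something here depends on them.
    add_subdirectory(externals/llvm-project/llvm ${CMAKE_BINARY_DIR}/llvm EXCLUDE_FROM_ALL)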

Also, switching to this mode has some other, softer benefits: it makes it worthwhile to trim dependencies and get a compile-time saving (whereas if you’re always building everything, there isn’t much motivation). I did set up the new way so that, with a few tweaks, we can leave most of the LLVM build (i.e. all the codegen/JIT bits) out of circt’s python bindings. I am betting this may produce a 70% savings. It won’t help npcomp (since it uses that stuff) but will probably help circt.


I saw that! I haven’t done anything to poke at this yet, but looking forward to seeing how much savings we can get.

I need to split the execution engine from the core module. It was aggregated by mistake. Shouldn’t be a hard fix… just need the time.

OK, then I’m on board the unified build train!

Awesome, welcome aboard.

For the first pass at this, I will update the Python build documentation and nightly integration tests to perform a unified build with LLVM_EXTERNAL_PROJECTS. That’s an approach that already works for CIRCT and is the smallest change to support the new Python machinery. For now, our CMakeLists.txt will require this and error out if you try to build Python bindings without LLVM_EXTERNAL_PROJECTS. We can iterate and support other configurations later.
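For a sense of what that guard might look like in CMakeLists.txt, here is a sketch only; the exact condition and wording in the PR may differ:

    # Reject a standalone (find_package-based) build when Python bindings are requested.
    if(CIRCT_BINDINGS_PYTHON_ENABLED AND CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
      message(FATAL_ERROR
        "CIRCT_BINDINGS_PYTHON_ENABLED requires a unified build via "
        "LLVM_EXTERNAL_PROJECTS=circt; see the Python bindings docs.")
    endif()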