Building tf-opt and mlir-opt from the iree hierarchy?

This question may be a bit off-topic, as it concerns the bazel build process. However, it touches on a question of economy (time and disk space) that seems important in the MLIR ecosystem.

I currently have one separate copy of llvm-project, one of tensorflow, and one of iree, the latter making its own copies of both llvm-project and tensorflow.

My question is the following: Is it possible, using bazel commands, to build tf-opt and mlir-opt from inside the iree repository?

There are folks around here who are bazel experts (I am not one of them), but I think I can answer this.

In iree, there are a few bazel workspaces involved. Which ones you can access depends on where you are running bazel from.

If you are running bazel from the root of the tree, you are in the iree_core workspace, and unqualified targets (e.g. //iree/tools:iree-opt) reference targets in @iree_core. From here, you can also reference LLVM and MHLO targets via the workspace aliases @llvm-project (e.g. @llvm-project//mlir:mlir-opt) and @mlir-hlo (e.g. @mlir-hlo//:mlir-hlo-opt).
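As a quick sanity check (a sketch, run from the IREE root; the exact target lists vary by version), you can ask bazel query what an aliased workspace exposes before trying to build anything:

```shell
# List targets in the aliased LLVM/MLIR workspace (run from the IREE root).
bazel query '@llvm-project//mlir:all' | head

# Confirm a specific tool target exists before building it.
bazel query '@llvm-project//mlir:mlir-opt'
```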

If you are running bazel under the IREE integrations/tensorflow directory, then the @org_tensorflow workspace will also be mapped in, and you can access the main IREE code via the workspace alias @iree.

So if running bazel under integrations/tensorflow, I believe the following would get you what you want, give or take typos:

  • bazel run @org_tensorflow//tensorflow/compiler/mlir:tf-opt
  • bazel run @llvm-project//mlir:mlir-opt

If running bazel at the root of the IREE repo, only the second one would work (since IREE’s core/root workspace does not depend on TensorFlow).

Unlike TensorFlow, we pin all of these deps to submodules in third_party (versus having sources fetched opaquely by bazel) and then do Bazel shenanigans to link it all together into one workspace graph.

If in doubt, look at the BUILD files in the tree you are in and you will see fully qualified targets like the above. Those can also be used from the command line so long as you are in a sub-directory that roots on the same WORKSPACE file.
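For illustration, a dependency edge in a BUILD file typically uses these fully qualified labels. This is a hypothetical rule (the target name, source file, and the @mlir-hlo label are examples, not actual IREE targets; check the real BUILD files):

```python
# Hypothetical BUILD rule showing fully qualified cross-workspace labels.
cc_library(
    name = "my_pass",                # example target name
    srcs = ["my_pass.cc"],
    deps = [
        "@llvm-project//mlir:IR",    # MLIR target via the workspace alias
        "@mlir-hlo//:hlo",           # example label; verify in the actual BUILD files
    ],
)
```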

If doing TensorFlow-centric work in IREE, work out of the integrations/tensorflow directory. Otherwise, work out of the project root (much smaller dependency graph).


Thanks a lot!

Just tried it, it works. BTW, the building experience is very different depending on whether I work at the root of the iree repo or in integrations/tensorflow:

  • When building llvm-project from the iree root, build starts right away in the same bazel folder, and it seems to reuse files compiled during the iree build.
  • When building tensorflow under integrations/tensorflow, a new bazel folder is created, and it seems to recompile everything from scratch.

Keep in mind that those are actually two different projects (IREE itself and integrations/tensorflow) that just happen to live in the same repo for historical reasons. IREE itself tries to keep its dependencies and layering under control and is where we do core development. The TensorFlow frontends necessitate a dependency on TensorFlow, so that project depends on both TensorFlow and IREE.

The tensorflow bazel build is monolithic and very bloated. Outside of the frontends, we do not take a dep on it, as even referencing it places a large burden on the build system and makes the software less portable. Afaik, there is no way to share a bazel root between projects. You can set up a bazel disk cache, and that might add some incrementality, but project switching isn't really something we optimize for. Most of the IREE devs who aren't specifically working on TensorFlow frontends just build the frontends once (or install a binary) and never touch that side of it unless something changes.


Everything Stella said is correct. To add a bit more detail, we define an entirely separate WORKSPACE under integrations/tensorflow that brings in TF and all its deps and IREE and all its deps. This means that Bazel running here will start a separate Bazel server instance with a separate output base. I think it's maybe possible to force Bazel to reuse the same output base with the --output_base flag, but I have no idea whether this would actually work or be helpful at all. You're basically telling Bazel to build two separate projects as if they were the same project, and I think you'd just thrash the cache.

You can use a shared --disk_cache (personally I define a global disk_cache in ~/.bazelrc), but I'm not sure if you'd actually get cache hits from project to project, since Bazel is usually pretty picky about ensuring it only gives cache hits when you really are calling exactly the same command. As an aside, since you're trying to build TF with Bazel, we've found that Bazel's symlink-based sandboxing scales poorly with core count, so throwing build --sandbox_base=/dev/shm in your ~/.bazelrc helps if you're trying to build with a ton of cores (which is the only way I've successfully built TF locally). See "Bazel deadlocks hosts with large numbers of cores" (bazelbuild/bazel#11868 on GitHub).
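As a concrete sketch of those two settings together (the cache path is illustrative; adjust for your machine):

```
# ~/.bazelrc
# Shared on-disk action cache across workspaces; path is up to you.
build --disk_cache=~/.cache/bazel-disk-cache
# Put sandboxes on tmpfs to avoid symlink-sandbox slowdowns on many-core machines.
build --sandbox_base=/dev/shm
```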

You’ll also notice that in integrations/tensorflow/WORKSPACE we define all the TF dependencies first, allowing them to take precedence, including @llvm-project, which means that it’s not using IREE’s submodule. Because we use a version of TF that points to the same version of LLVM as IREE uses, these should end up being the exact same files if you have a clean submodule state, but the route for getting them is totally different. The build files for them are also totally different, because TF has its own copy of LLVM build files. This proliferation of build files is something I’m trying to address with the (now accepted) proposal to put Bazel build files in a side directory in the LLVM monorepo.

One of the reasons integrations/tensorflow is a separate workspace is that taking a library/build dependency on TF is super painful, so we instead try to flip the dependency and basically make this a TF build with a dependency on IREE. That’s the theory, at least. We’re not all the way there; in particular, my attempt to use TF’s toolchains ended up breaking local builds for reasons that remain a mystery to me. I think we could probably make that build use IREE’s submodule, though that hasn’t been something I’ve focused on.
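The precedence rule at work here is just Bazel's: the first declaration of a repository name in a WORKSPACE wins, and later declarations of the same name are ignored. Schematically (hypothetical paths, not the actual WORKSPACE contents):

```
# integrations/tensorflow/WORKSPACE (schematic)

# TF's deps are declared first, so this definition of @llvm-project wins...
local_repository(
    name = "llvm-project",
    path = "path/to/tf/llvm-project",  # hypothetical path
)

# ...and any later declaration of @llvm-project (e.g. pointing at IREE's
# submodule) would be ignored by Bazel.
```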

To Stella’s point about what IREE devs actually do: basically the entire integrations/tensorflow workspace exists only to build the five binaries defined in iree_tf_compiler. Because those are mostly for a Python integration (and we don’t attempt to build Python with Bazel), you can’t really get much further than that in Bazel land anyway. We have pip packages (which we’re working on productionizing) to distribute those binaries. So the integrations/tensorflow Bazel build really only aims to support that release and those actively working on IREE’s TF integration.

One thing you could do, however, if you’re worried about having several separate checkouts of TF, is to change into IREE’s TF submodule and run Bazel from there. It will again be a separate Bazel root, and TF will still fetch its own copy of LLVM.
