Why is tinycc so much faster than clang -O0?

Running Clang at optimization level 0 (-O0) is supposed to mean that no optimization is done.

Yet everything I’m seeing suggests it’s still roughly 5x to 10x slower than tinycc, which is supposed to do the same. (See this benchmark, which compares against GCC, which in turn is roughly as fast as LLVM: TCC : Tiny C Compiler)

I know Zapcc, which forks LLVM, is essentially supposed to accomplish this. But it’s only about 1.6 times faster according to this, which also backs up the claim that LLVM and GCC have similar performance: A Performance-Based Comparison of C/C++ Compilers | Colfax Research

Still seems like there is plenty of room to speed this up… any effort or plans on this front? Or should I use tinycc for fast compiles for debugging, then LLVM to compile the release version?

As far as I understand, the Rust community has encountered some slowdowns which they could attribute to the underlying LLVM facilities.

See these two articles as examples:

https://blog.mozilla.org/nnethercote/2020/04/24/how-to-speed-up-the-rust-compiler-in-2020/

https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed-up-the-rust-compiler-some-more-in-2020/

So yes, there is still room for performance improvements. But you can also get some wins by building a Clang toolchain yourself with PGO (profile-guided optimization), i.e. a multi-stage build in which a profile is collected in the first or second stage and then applied to the second or third stage respectively. I have done this on Linux, targeting a specific ARM core, and saw quite a performance boost just from doing that.

Otherwise, it’s open source, so if you have encountered a particular bottleneck area you could report it (which also takes time and effort) or even start fixing it.

The community seems rather welcoming.


PGO definitely seems worth checking out… From what I’m seeing, it tends to yield at most roughly a 20% speedup, but it’s still worth looking at. Do you feel like you’ve done better than this?

Good idea to do some testing looking for bottlenecks.
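
For anyone who wants to try this kind of testing locally, here is a minimal sketch of a timing harness in Python. It assumes both `tcc` and `clang` are on your PATH (anything missing is skipped). The helper names `best_time` and `compare` and the `COMPILERS` table are made up for illustration. It only measures compile-only runs (`-c`), so linking is excluded, and process startup cost is included in every measurement.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import time

# Hypothetical table of compilers to compare: label -> base command line.
# Adjust the commands to whatever is installed on your machine.
COMPILERS = {
    "tcc": ["tcc"],
    "clang -O0": ["clang", "-O0"],
}


def best_time(cmd, repeats=5):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,  # raise if the command fails, so bad runs are not timed silently
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    # The minimum is usually the least noisy estimate for short-running commands.
    return min(timings)


def compare(source, compilers, repeats=5):
    """Time a compile-only run of `source` for each compiler found on PATH."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "out.o")
        for label, base_cmd in compilers.items():
            if shutil.which(base_cmd[0]) is None:
                continue  # compiler not installed, skip it
            cmd = base_cmd + ["-c", source, "-o", obj]
            results[label] = best_time(cmd, repeats)
    return results
```

Calling `compare("hello.c", COMPILERS)` on a source file of your choice returns a dict of label to seconds. If both compilers were found, dividing `results["clang -O0"]` by `results["tcc"]` gives the slowdown factor for that one file. A single small file mostly measures startup overhead, so a larger translation unit is probably more representative.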