ADT: StaticVector implementation

I have an implementation of llvm::StaticVector I’d like to upstream. Essentially, it would be llvm::SmallVector without the potential to grow beyond inline storage. This saves overhead in the class storage as well as trims any code generated related to growing the heap buffer.

This container is a small piece in a larger vision of seamlessly + efficiently interoperable bitsets with various underlying storage strategies. One flavor of bitset I think would be beneficial is an llvm::ContiguousBitset class, which is a container adapter that overlays a bitset API on top of any vector-like contiguous container. The llvm::SmallVector would be an important underlying container for this bitset adapter, but I believe the streamlined llvm::StaticVector would also be important (think machine model code where resources have compile-time constant limits).

The benefit of using a vector-like container as the underlying storage is that we can track the size separate from the buffer capacity. This means, that if you have a large number of trailing 0s after an operation, you can size the vector down to the last non-zero word and avoid walking those trailing 0s in the future.

In addition, there’s no reason the existing llvm::BitVector classes can’t be pulled into this common API and all efficiently interoperate with a new bitset concept.

Since I’m new to upstream LLVM (and LLVM in general) I was wondering where to start this process. I figured the llvm::StaticVector is a good place to start. I do have a patch ready to go… but would be willing to go through an RFC process if necessary.

All of this is something I’ve done behind a company firewall in the past. Just wondering if this is something the community would be interested in.

1 Like

Feel free to put the patch on Phabricator as a starting point for the discussion around StaticVector!

I’m curious about the use-case in the context of compilers though? I assume it would just crash the process if you try to push_back beyond the inline storage? So this is only ever applicable when you don’t know the size / it’s dynamic (otherwise you’d use std::Array) ; but not dynamic enough that you don’t have a guarantee’d upper bound.

The current implementation would assert() if you push_back() beyond the stated capacity, or read past the end for that matter. Whether an assert() is the right approach would be a topic to discuss during the code review.

Essentially its a std::array with the std::vector API over top of it to track the number of valid elements. When you instantiate a StaticVector, it can be empty (i.e. no element constructors are run). When you instantiate a std::array, all the element constructors are run. This is a useful property if you don’t have element constructor inputs available at the time you construct the container.

I was following the proposed std::static_vector spec when I initially implemented this privately, but pulled in all the SmallVector API enrichments for my current patch.

In terms of usage, I see this container used as the underlying storage for a bitset adapter container (think target-specific machine resources). I have this bitset layer implemented too, but I expect that to be more controversial given the current availability of llvm::BitVector containers.

Patch is up on Phabricator:
https://reviews.llvm.org/D112175

Thanks!