LLVM-MOS Backend

Hello LLVM discourse!

I’m the main codegen developer of LLVM-MOS, an out-of-tree LLVM backend for the 6502:


We’re still a ways away from our first official release, but it’s now starting to get about that time, so I wanted to touch base with the broader LLVM community about what we’ve built, how strange it is, and start a converstation about whether it (eventually) belongs in the monorepo.

The compiler is fairly C99-complete, lacking only floating point. I’ve gone through the llvm-test-suite and gotten every SingleSource/ test I could manage to fit in 64k passing. Quite a few of these required 32-bit integers; I converted those tests to accept 16-bit ints where feasible, and disabled the rest.

From a code generation point of view, the project’s driving motivation is to apply LLVMs tremendous analysis and optimization capabilities to the very difficult challenge of generating good code for the 6502.

We’re not finished optimizing it by a long shot (even to the point suitable for a first release), but the results we’ve obtained so far are quite promising.

For those unfamiliar with the 6502, it’s a challenging architecture:

  • An 8-bit accumulator and two 8-bit index registers, each special-purposed for different instructions.
  • An 8-bit hardware stack, suitable for nothing more than return addresses and a few bytes of temporaries.
  • One practical indirect addressing mode that could be used to access a simulated soft stack, which is less than half the speed of an absolute memory access.
  • 8-bit indexed addressing modes, while the pointer size is 16-bits. The 8-bit index is silently zero-extended before adding.
  • Fast zero page (first 256 bytes of RAM) memory addressing modes.

We’ve dealt with these problems by using LLVM in a fairly unorthodox ways. Our two strategies are:

  1. Make the 6502 more “normal”:
  • A zero page region is used as a bank of “imaginary” general purpose registers. With this, more-or-less able to steal the RISC-V calling convention.
  • Rewrite the 6502 instructions set into a “logical” instruction set made of pseudos. For example, the three separate immediate load opcodes, LDA #12, LDX #12, and LDY #12, can all be collapsed into a logical LDImm opcode: %0:GPR = LDImm 12. This allows regular register allocation techniques to mostly work on the 6502.
  • Simulate a soft stack using a pointer. Standard practice for 6502 C compilers.
  1. Make LLVM more “weird”:
  • A whole program analysis walks the call graph and identifies functions that cannot have more than one invocation simultaneously active. These functions have their stack frames allocated as global arrays instead of on the soft stack. The downside is that asynchronous entries into LTO units must be annotated as such. Otherwise, e.g. leaf functions would always use this optimization, even if called both from main and from an interrupt handler.
  • I’ve had to do unspeakable things to Loop Strength Reduction to make the 6502’s addressing modes representable. LSR’s Formula data model assumes that all components of the sum that make up an addressing mode have the same width, but on the 6502, they generally have different widths, with the smaller components being implicitly zero-extended.
  • Extremely unusual legalization patterns depending heavily on the quality of lowering G_SELECT to branches. Most multi-byte legalizations on the 6502 involve branching, since branching is extremely cheap on the 6502, since there’s no cache and no pipeline.
  • Who knows, there may be more!

So, given all this, I’d like to open up discussion on whether or not this belongs in LLVM proper. While it’s true that this effort has resulted in making parts of LLVM more general, it’s unclear to me whether or not that’s a direction in which LLVM would actually like to go.

Would having a cool 6502 backend be a feather in LLVM’s cap? A pointless burden? Somewhere in between?

It’s also very likely that I’ve implemented something in a ridiculous way. Please don’t be shy about telling me so; I’m a relative newcomer to compiler development!

It’s been a wild ride getting from there to here, and there’s still a long ways to go. Kudos to all those who helped make LLVM what it is; our little backend stands on the shoulders of giants!


Just a fly-by comment with not much depth, but this is very, very cool! Like so many of my generation, I learned 65xx machine code in the eighties on a Commodore 64 and I am still involved in various 65xx related retro projects involving the micro-KIM and such. So, needless to say, I am very excited that I can now use LLVM in that context too!

1 Like

The first optimization I recommend on 6502/6510 is to convert multibyte data structures into multiple single-byte data structures. This allows the elimination of indexed indirect addressing when dealing with small arrays not to mention no having to scale the y index to access all elements. In other words, make byte planes whenever possible so you can use direct indexed memory accesses instead of indirect zero page base pointers.

This architecture has neither caches nor pipelining so setting up setter and getter methods to self-modify the high-byte of a direct indexed access as an alternative to using zero-page registers. (Yes, zero-page should be used to hold pointers when it can’t otherwise be avoided.)

In the what’s-been-done-before column, the author of the Contiki operating system treated the x index as the high byte of the accumulator when doing 16-bit math. This keeps the y index available for indexing when using the indexed indirect addressing mode. (Especially when indirect indexed addressing mode is so seldom used in comparison.) Of course, if doing single byte ops, freeing up the x index allows a separate index value for direct indexing for use as a destination index if y is the source, as Volker Barthelman, the author of VBCC figured out in his 65xx backend.

The source codes of the two compilers mentioned above are available but VBCC has stipulations on commercial usage, if I recall. Depending on your usage, using the documentation from those might be enlightening. I don’t remember if VBCC is publicly available on 65xx yet but Volker posted about it on a forum.