<div dir="ltr">I haven't seen what you are doing, but if I was writing a back end for the 6502, I'd lie to LLVM and describe RAM page 0 as being the real registers, and A, X and Y as being special purpose registers used for temporaries.<div><br></div><div>If your code is dealing with 8 bit values then you can keep a value in A for some time, but if there are 16 bit variables then you have no choice but to compile a = b + c + d into sequences like</div><div><br></div><div>clc</div><div>lda 10</div><div>adc 20</div><div>sta 40</div><div>lda 11</div><div>adc 21</div><div>sta 41</div><div>clc</div><div>lda 40</div><div>adc 30</div><div>sta 40</div><div>lda 41</div><div>adc 31</div><div>sta 41</div><div><br></div><div>(assuming a, b, c, d are stored in RAM at 40-41, 30-31, 20-21, and 10-11.)</div><div><br></div><div>I don't think there is any way you can do better by adding all the low bytes together .. is there? You'd have to save the carries and add them one at a time.</div><div><br></div><div>Hmm.</div><div><br></div><div>clc</div><div>lda 10</div><div>adc 20</div><div>php</div><div>clc</div><div>adc 30</div><div>sta 40</div><div>lda 11</div><div>adc 21</div><div>plp</div><div>adc 31</div><div>sta 41</div><div><br></div><div>Interesting .. saves two instructions (12 vs 14), six bytes (20 vs 26), five clock cycles (35 vs 40).</div><div><br></div><div>But I'm not sure it's worth the middle end of LLVM knowing about this. Better to treat it as a 16 bit machine and use the actual 6502 registers only in a kind of macro way in code generation?</div><div><br></div><div>Also, this kind of code is simply *big*, and on a machine with small memory. 
A typical RISC with 32 bit instructions does this in 8 bytes and two instructions, and Thumb does it in 6 bytes and three instructions...</div><div><br></div><div>Have you looked at compiling most code to Sweet16 (or an updated version, but Woz did a nice job on that), and only innermost loops to real code?</div><div><br></div><div><a href="http://www.6502.org/source/interpreters/sweet16.htm">http://www.6502.org/source/interpreters/sweet16.htm</a><br></div><div><br></div><div>This interacts well with native code -- it's easy to pop into Sweet16 and out again.</div><div><br></div><div>Example use of Sweet16:</div><div><br></div><div><div>300  B9 00 02           LDA   IN,Y     ;get a char</div><div>303  C9 CD              CMP   #"M"     ;"M" for move</div><div>305  D0 09              BNE   NOMOVE   ;No. Skip move</div><div>307  20 89 F6           JSR   SW16     ;Yes, call SWEET 16</div><div>30A  41         MLOOP   LD    @R1      ;R1 holds source</div><div>30B  52                 ST    @R2      ;R2 holds dest. addr.</div><div>30C  F3                 DCR   R3       ;Decr. length</div><div>30D  07 FB              BNZ   MLOOP    ;Loop until done</div><div>30F  00                 RTN            ;Return to 6502 mode.</div><div>310  C9 C5      NOMOVE  CMP   #"E"     ;"E" char?</div><div>312  F0 13              BEQ   EXIT     ;Yes, exit</div><div>314  C8                 INY            ;No, cont.</div></div><div><br></div><div>An alternative would be to compile inner loops to native code and everything else to a pseudo register machine (probably similar in concept to Sweet16) but using indirect threaded code. i.e. The "code" is a list of addresses of functions, and the first two bytes of each function are the address of the "interpreter" for that function. For native functions, the interpreter is the function itself i.e. 
the first two bytes of the function point to the 3rd byte of the function, not to an interpreter.</div><div><br></div><div>Of course, all this assumes that you can compile to native code in the first place, even if it is big. So that's a good first step.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Feb 12, 2016 at 12:39 PM, N. E. C. via llvm-dev <span dir="ltr">&lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Greetings, LLVM devs,<br>

<br>

For the past few weeks, I have been putting together a 6502 backend for LLVM.<br>

The 6502 and its derivatives, of course, have powered countless microcomputers,<br>

game consoles and arcade machines over the past 40 years.<br>

<br>

The backend is just an experimental hobby project right now. The code is<br>

available here: &lt;<a href="https://github.com/beholdnec/llvm-m6502" rel="noreferrer" target="_blank">https://github.com/beholdnec/llvm-m6502</a>&gt;. This branch<br>

introduces<br>

a target called "m6502", which can be used in llc to compile some very simple<br>

functions. Only a few instructions are implemented, it's not useful for anything<br>

yet.<br>

<br>

There was another attempt in August of last year by c64scene-ar on GitHub to<br>

design a 6502 backend, however, the project appears to be stalled with no<br>

substantial progress. As far as I know, my backend is the only one able to<br>

generate 6502 instructions.<br>

<br>

Here is a test file: &lt;<a href="https://gist.github.com/beholdnec/910eba79391bb24ba2fa" rel="noreferrer" target="_blank">https://gist.github.com/beholdnec/910eba79391bb24ba2fa</a>&gt;.<br>

<br>

I would like to ask for help as I'm stuck on one particularly sticky problem.<br>

I'll describe the problem shortly.<br>

<br>

Occasionally, the topic of a 6502 backend comes up on this mailing list. Here<br>

is an old thread talking about some of the challenges involved:<br>

&lt;<a href="https://groups.google.com/forum/#!topic/llvm-dev/w37MfNU_Ag8" rel="noreferrer" target="_blank">https://groups.google.com/forum/#!topic/llvm-dev/w37MfNU_Ag8</a>&gt;.<br>

<br>

The 6502 has only three 8-bit registers: A, X and Y, and 256 bytes of hardware-<br>

supported stack. Generating code for such a constrained system pushes LLVM to<br>

its limits.<br>

<br>

For one thing, LLVM couldn't figure out how to lower an ADD instruction that<br>

added a reg to a reg. The 6502's add instruction (ADC) can only add register A to an<br>

immediate or a value loaded from memory. There is no instruction that adds A to<br>

another register.<br>

<br>

I had thought LLVM would allocate a stack object for the second operand, but<br>

it didn't, and LLVM threw an ISel matching error. I currently solve this with<br>

a custom ADD lowering function, see LowerADD in M6502ISelLowering.cpp.<br>

Question: Is custom lowering ideal for this situation? Or, is there another way<br>

to coax LLVM into recognizing ADD?<br>

<br>

The problem I'm stuck on is folding memory operands. In the test file above,<br>

in @testSum, switch %a, %b to %b, %a. llc will assert in Register Spilling:<br>

"Remaining use wasn't a snippet copy". Debug output shows STRabs being<br>

generated,<br>

followed by an attempted fold of a stack-load into ADDabs.<br>

<br>

I must be on the wrong track in M6502InstrInfo::foldMemoryOperandImpl. If<br>

someone could please explain this error, it would really help. Thanks!<br>

<br>

- Nolan<br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div><br></div>