[PATCH] D108643: Introduce _BitInt, deprecate _ExtInt

Erich Keane via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 14 15:13:53 PDT 2021


erichkeane added a comment.

In D108643#3000540 <https://reviews.llvm.org/D108643#3000540>, @rjmccall wrote:

> The question is whether you can rely on extension at places that receive an arbitrary ABI-compatible value, like function parameters or loads.  If nobody who receives such a value can rely on extension having been done, then the ABI is not meaningfully "leaving it unconstrained": it is constraining some places by the lack of constraint elsewhere.  That is why this is a trade-off.

I see, thanks for that clarification!

> Okay, but this is a general-purpose feature being added to general-purpose targets.  Clang cannot directly emit AMD x86-64 microcode, and I doubt that an `and; and; cmp` instruction sequence gets fused into a masked compare in microcode.  Those masked comparisons are probably just used to implement 8/16/32-bit comparisons, selected directly on encountering a compare instruction of that width.

I don't work on the microcode; that is just what I was told when we asked about this.  So until someone can clarify, I have no idea.

> This still doesn't make any sense.  If you're transmitting a `_BitInt(17)` as exactly 17 bits on a dedicated FPGA<->x86 bus, then of course continue to do that.  The ABI rules govern the representation of values in the places that affect the interoperation of code, such as calling conventions and in-memory representations.  They do not cover bus protocols.

Again, it was an argument made at the time that is outside my direct expertise, so if you have experience with mixed FPGA/traditional-core interfaces, I'll have to defer to you.

> This entire discussion is about what the ABI rules should be for implementing this feature on general-purpose devices that don't directly support e.g. 17-bit arithmetic.  Custom hardware that does support native 17-bit arithmetic obviously doesn't need to care about those parts of the ABI and is not being "punished".  At some point, 17-bit values will come from that specialized hardware and get exposed to general-purpose hardware by e.g. being written into a GPR; this is the first point at which the ABI even starts dreaming of being involved.  Now, it's still okay under a mandatory-extension ABI if that GPR has its upper bits undefined: you're in the exact same situation as you would be after an addition, where it's fine to turn around and use that in some other operation that doesn't care about the upper bits (like a multiplication), but if you want to use it in something that cares about those bits (like a comparison), you need to zero/sign-extend them away first.  The only difference between an ABI that leaves the upper bits undefined and one that mandates extension is that places which might expose the value outside of the function — like returning the value, passing the value as an argument, and writing the value into a pointer — have to be considered places that care about the upper bits; and then you get to rely on that before you do things like comparisons.
>
> Again, I'm not trying to insist that a mandatory-extension ABI is the right way to go.  I just want to make sure that we've done a fair investigation into this trade-off.  Right now, my concern is that it sounds like that investigation invented a lot of extra constraints for mandatory-extension ABIs, like that somehow mandatory extension meant that you would need to transmit a bunch of zero bits between your FPGA and the main CPU.  I am not a hardware specialist, but I know enough to know that this doesn't check out.

Again, at the time my FPGA-CPU interconnect experts took issue with making the extra bits 0. That account is filtered through my memory and the "ELI5" explanation that was given to me, so I apologize that it didn't come through correctly.

> I have a lot of concerns about turning "whatever LLVM does when you pass an i17 as an argument" into platform ABI.  My experience is that LLVM does a lot of things that you wouldn't expect when you push it outside of simple cases like power-of-two integers.  Different targets may even use different rules, because the IR specification doesn't define this stuff.

That seems like a better argument for leaving them unspecified, I would think.  If we can't count on our backends to act consistently, then forcing a decision on them is obviously going to mean some level of behavior change or performance hit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108643/new/

https://reviews.llvm.org/D108643
