[PATCH] D61259: AArch64: support arm64_32, an ILP32 slice for watchOS.

Mon Jul 8 03:09:31 PDT 2019

t.p.northover marked 6 inline comments as done.
t.p.northover added a comment.

> At a high level, my main care about here is that what is upstreamed here doesn't conflict with also adding support at some point in the future for the AArch64 ILP32 ABI - for which a beta quality specification from Arm exists.

I think most of the changes should be reusable.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:9747
+    if (Res.getOpcode() == ISD::AssertZext)
+      Res = Res.getOperand(0);
+
----------------
efriedma wrote:
> Does this have any practical effect for other targets?
I don't think so. It's a correctness issue so if other targets were affected then they'd start seeing vregs without definitions and things.

I believe the driving arm64_32 feature is the fact that pointers are extended beyond 32-bits by the **caller**, which leads to some weird (but valid) DAG nodes trying to communicate that fact. Other targets only assertzext from invalid types.

================
Comment at: llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp:1131-1132
   RegisterAsmPrinter<AArch64AsmPrinter> Z(getTheARM64Target());
+  RegisterAsmPrinter<AArch64AsmPrinter> W(getTheARM64_32Target());
+  RegisterAsmPrinter<AArch64AsmPrinter> V(getTheAArch64_32Target());
 }
----------------
kristof.beyls wrote:
> I assume this patch introduces support for the Darwin-specific arm64_32 ILP32 ABI.
> That makes me wonder why there is a need to define/create the "getTheAArch64_32Target()". Doesn't "getTheARM64_32Target()" suffice to enable support for the Darwin-specific arm64_32 ILP32 ABI?
I think it would be enough for production purposes, but the extra lines allow someone to write triple strings starting with `aarch64_32` in tests and so on if they want to.

I still have a long-term goal to switch Clang over to using `aarch64` as the canonical name in LLVM (`aarch64-apple-ios14.0` etc). I was prevented when I tried years ago because ld64 wasn't ready, but I think I fixed that and I should try again some time.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3736

-  SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;
+  std::map<unsigned, SDValue> RegsToPass;
   SmallVector<SDValue, 8> MemOpChains;
----------------
efriedma wrote:
> What's changing here?  Does it make sense to add any comments?
We use the ability to lookup whether a register has already been added (use at line 3801 in this diff). If it has then we're trying to combine two parts of (say) a `[2 x i32]` into a single register for compatibility with armv7k IR and need to do the bitwise arithmetic to make that happen.

I don't know that a comment here would work (it would either be a historic note, or pre-empting what comes later). I'll try to do something to call it out unobtrusively at the use-point.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.h:258-260
+  MVT getPointerTy(const DataLayout &DL, uint32_t AS = 0) const override {
+    return MVT::getIntegerVT(64);
+  }
----------------
kristof.beyls wrote:
> I found it surprising that this method doesn't loop for whether ILP32 is targeted or not to decide whether the pointer type is a 32 or 64 bit integer.
> Would it be worthwhile to add a small comment here why the integer pointer type is always 64bits?
> Maybe I'm missing something completely trivial though...
A comment would definitely be worthwhile. This is the function that enables the large change I upstreamed earlier so that pointer types in the DAG can remain 64-bits (to exploit addressing-modes available) and get truncated to 32-bits in memory.

================
Comment at: llvm/test/CodeGen/AArch64/tail-call.ll:4
 declare fastcc void @callee_stack0()
-declare fastcc void @callee_stack8([8 x i32], i64)
-declare fastcc void @callee_stack16([8 x i32], i64, i64)
+declare fastcc void @callee_stack8([8 x i64], i64)
+declare fastcc void @callee_stack16([8 x i64], i64, i64)
----------------
efriedma wrote:
> What are you trying to do here?
These functions are using the arrays purely to consume register space during a call, but after this change `[8 x i32]` only uses x0-x3 (in full). It's only an IR-level change (i.e. it doesn't affect C or C++ ABI), but I'll limit it to the arm64_32 target when I update the diff.

================
Comment at: llvm/test/CodeGen/AArch64/win64_vararg.ll:271
+; CHECK: mov     x1, [[REG2]]
+; CHECK: mov     x3, [[REG3]]
 ; CHECK: str     x30, [sp, #16]
----------------
efriedma wrote:
> There isn't any obvious reason for this test to change?
I think it's because of the `std::map` change you called out above. It implicitly sorts the list of registers that get copied, perturbing the DAG and scheduling.

I think I'll switch it back to `SmallVector` and use `std::find_if` to handle the (rare) ARM compatibility instead. It ought to be faster in the common case and won't have this side-effect.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61259/new/

https://reviews.llvm.org/D61259