[PATCH] D61259: AArch64: support arm64_32, an ILP32 slice for watchOS.
Amara Emerson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 22 00:48:25 PDT 2019
aemerson added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64CallingConvention.cpp:115
- unsigned RegResult = State.AllocateRegBlock(RegList, PendingMembers.size());
- if (RegResult) {
+ unsigned EltsPerReg = (IsDarwinILP32 && LocVT.SimpleTy == MVT::i32) ? 2 : 1;
+ unsigned RegResult = State.AllocateRegBlock(
----------------
Could you add a comment here explaining why we need this? I'm guessing it's for passing arrays of i32s and compatibility with armv7k?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3736
- SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;
+ std::map<unsigned, SDValue> RegsToPass;
SmallVector<SDValue, 8> MemOpChains;
----------------
t.p.northover wrote:
> efriedma wrote:
> > What's changing here? Does it make sense to add any comments?
> We use the ability to lookup whether a register has already been added (use at line 3801 in this diff). If it has then we're trying to combine two parts of (say) a `[2 x i32]` into a single register for compatibility with armv7k IR and need to do the bitwise arithmetic to make that happen.
>
> I don't know that a comment here would work (it would either be a historic note, or pre-empting what comes later). I'll try to do something to call it out unobtrusively at the use-point.
Looks like we still need some form of documentation at the RegUsed use point?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3428
+ // Avoid copying a physreg twice since RegAllocFast is incompetent and only
+ // allows one use of a physreg per block.
+ SDValue Val = CopiedRegs.lookup(VA.getLocReg());
----------------
Why does this happen at all?
================
Comment at: llvm/test/CodeGen/AArch64/arm64-collect-loh.ll:163
define i64 @getSExtInternalCPlus4() {
- %addr = getelementptr i32, i32* @InternalC, i32 4
+ %addr = getelementptr inbounds i32, i32* @InternalC, i32 4
%res = load i32, i32* %addr, align 4
----------------
Can you add a comment here explaining that `inbounds` is needed for arm64_32 to produce the same code?
================
Comment at: llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll:22
; NONEFP: ldr h0,{{.*}}
-; NONEFP: fmov s1, wzr
-; NONEFP: fmov d2, xzr
-; NONEFP: movi{{(.16b)?}} v3{{(.2d)?}}, #0
-; NONE16: fmov h0, wzr
-; NONE16: fmov s1, wzr
-; NONE16: fmov d2, xzr
-; NONE16: movi{{(.16b)?}} v3{{(.2d)?}}, #0
+; NONEFP-DAG: fmov s1, wzr
+; NONEFP-DAG: fmov d2, xzr
----------------
It seems odd that this test changed. Do you know why?
================
Comment at: llvm/test/CodeGen/AArch64/arm64_32-neon.ll:6
+; CHECK: mov.d v0[0], v1[0]
+ %res = insertelement <2 x double> %vec, double %val, i32 0
+ ret <2 x double> %res
----------------
Is anything really expected to change with NEON & arm64_32?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D61259/new/
https://reviews.llvm.org/D61259
More information about the llvm-commits mailing list