[llvm] [AArch64] Improve expansion of immediates of the form (~w << 32 | w). (PR #162286)
Ricardo Jesus via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 8 03:35:45 PDT 2025
================
@@ -239,6 +239,57 @@ static bool trySequenceOfOnes(uint64_t UImm,
return true;
}
+// Attempt to expand 64-bit immediate values whose negated upper half matches
+// the lower half (for example, 0x1234'5678'edcb'a987).
+// Immediates of this form can generally be expanded via a sequence of
+// MOVN+MOVK to expand the lower half, followed by an EOR to shift and negate
+// the result to the upper half, e.g.:
+//   mov  x0, #-22137           // =0xffffffffffffa987
+//   movk x0, #60875, lsl #16   // =0xffffffffedcba987
+//   eor  x0, x0, x0, lsl #32   // =0xffffffffedcba987 ^ 0xedcba98700000000
+//                              // =0x12345678edcba987.
+// When the lower half contains a 16-bit chunk of ones, such as
+// 0x0000'5678'ffff'a987, the intermediate MOVK is redundant.
+// Similarly, when it contains a 16-bit chunk of zeros, such as
+// 0xffff'5678'0000'a987, the expansion can instead be effected by expanding
+// the negation of the lower half and negating the result with an EON, e.g.:
+//   mov  x0, #-43400           // =0xffffffffffff5678
+//   eon  x0, x0, x0, lsl #32   // =0xffffffffffff5678 ^ ~0xffff567800000000
+//                              // =0xffffffffffff5678 ^ 0x0000a987ffffffff
+//                              // =0xffff56780000a987.
+// In any of these cases, the expansion with EOR/EON saves an instruction
+// compared to the default expansion based on MOV and MOVKs.
+static bool tryCopyWithNegation(uint64_t Imm,
+                                SmallVectorImpl<ImmInsnModel> &Insn) {
+  // We need the negation of the upper half of Imm to match the lower half.
+  // Degenerate cases where Imm is a run of ones should be handled separately.
+  if ((~Imm >> 32) != (Imm & 0xffffffffULL) || llvm::isShiftedMask_64(Imm))
+    return false;
+
+  const unsigned Mask = 0xffff;
+  unsigned Opc = AArch64::EORXrs;
+
+  // If we have a chunk of all zeros in the lower half, we can save a MOVK by
+  // materialising the negated immediate and negating the result with an EON.
+  if ((Imm & Mask) == 0 || ((Imm >> 16) & Mask) == 0) {
+    Opc = AArch64::EONXrs;
+    Imm = ~Imm;
+  }
+
+  unsigned Imm0 = Imm & Mask;
+  unsigned Imm16 = (Imm >> 16) & Mask;
+  if (Imm0 != Mask) {
+    Insn.push_back({AArch64::MOVNXi, Imm0 ^ Mask, 0});
+    if (Imm16 != Mask)
+      Insn.push_back({AArch64::MOVKXi, Imm16, 16});
----------------
rj-jesus wrote:
Hi, it looks like at least `tryToreplicateChunks` may also generate two or three instructions depending on how many repeated chunks it finds.
In any case, would you rather I add a parameter to control the number of instructions this method is allowed to emit? Then it could be called twice in `AArch64_IMM::expandMOVImm` with the appropriate limits of two and three. Would that be preferable?
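The limit-parameter idea could be sketched roughly as follows. This is a standalone model for discussion, not the patch: `MaxInsns`, `Op`, `Insn`, `simulate` and `tryCopyWithNegationSketch` are hypothetical names, the real code uses `ImmInsnModel` and AArch64 opcodes, and the `else` branch is an assumption because the quoted hunk is cut off before that point. The small `simulate` helper only exists to check the EOR/EON arithmetic described in the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not the LLVM types.
enum class Op { MOVN, MOVK, EOR, EON };
struct Insn {
  Op Opc;
  uint64_t Imm;
  unsigned Shift;
};

// Local re-implementation of the "contiguous run of ones" test.
bool isMask64(uint64_t V) { return V && ((V + 1) & V) == 0; }
bool isShiftedMask64(uint64_t V) { return V && isMask64((V - 1) | V); }

// Sketch: same matching logic as the quoted hunk, plus an assumed MaxInsns
// limit. Out is only written when the sequence fits within the limit.
bool tryCopyWithNegationSketch(uint64_t Imm, unsigned MaxInsns,
                               std::vector<Insn> &Out) {
  if ((~Imm >> 32) != (Imm & 0xffffffffULL) || isShiftedMask64(Imm))
    return false;

  const uint64_t Mask = 0xffff;
  Op Opc = Op::EOR;
  // A zero chunk in the lower half: build the negation and finish with EON.
  if ((Imm & Mask) == 0 || ((Imm >> 16) & Mask) == 0) {
    Opc = Op::EON;
    Imm = ~Imm;
  }

  uint64_t Imm0 = Imm & Mask;
  uint64_t Imm16 = (Imm >> 16) & Mask;
  std::vector<Insn> Seq;
  if (Imm0 != Mask) {
    Seq.push_back({Op::MOVN, Imm0 ^ Mask, 0});
    if (Imm16 != Mask)
      Seq.push_back({Op::MOVK, Imm16, 16});
  } else {
    // Assumed handling (not visible in the quoted hunk): the low chunk is
    // all ones, so a single MOVN on the second chunk is enough.
    Seq.push_back({Op::MOVN, Imm16 ^ Mask, 16});
  }
  Seq.push_back({Opc, 0, 32});

  if (Seq.size() > MaxInsns)
    return false;
  Out = Seq;
  return true;
}

// Evaluate a sequence on a single 64-bit register to check the result.
uint64_t simulate(const std::vector<Insn> &Seq) {
  uint64_t X = 0;
  for (const Insn &I : Seq) {
    switch (I.Opc) {
    case Op::MOVN: X = ~(I.Imm << I.Shift); break;
    case Op::MOVK:
      X = (X & ~(0xffffULL << I.Shift)) | (I.Imm << I.Shift);
      break;
    case Op::EOR: X = X ^ (X << I.Shift); break;
    case Op::EON: X = X ^ ~(X << I.Shift); break;
    }
  }
  return X;
}
```

With this shape, the caller could try the method with a limit of two alongside the other two-instruction strategies and again with a limit of three later on, so a three-instruction EOR expansion never pre-empts a cheaper alternative.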
https://github.com/llvm/llvm-project/pull/162286