[llvm] [RISCV][MC] Fix >32bit .insn Directives (PR #111878)
Sam Elliott via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 10 10:52:40 PDT 2024
https://github.com/lenary created https://github.com/llvm/llvm-project/pull/111878
The original patch had a significant bug: `.insn` could not assemble encodings with any bits set above the low 32 bits. `getMachineOpValue` was truncating the immediate value, and I did not commit enough tests of useful cases to catch it.
This changes the return type of `getMachineOpValue` to `uint64_t` so that it can return the 48-bit and 64-bit immediates needed for the wider `.insn` directives.
I took the opportunity to move some of the test cases around in the file to make looking at the output of `llvm-objdump` a little clearer.
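To illustrate the truncation described above, here is a minimal standalone sketch. It is not the LLVM code: `truncatedOpValue`, `fullOpValue`, and `encode48` are hypothetical helpers that only mimic the old `static_cast<unsigned>` return and the new `uint64_t` return, and the sketch assumes `unsigned` is 32 bits wide.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the old behaviour: the immediate is narrowed to
// `unsigned` (assumed to be 32 bits) before being returned.
constexpr uint64_t truncatedOpValue(int64_t Imm) {
  return static_cast<unsigned>(Imm);
}

// Hypothetical stand-in for the fixed behaviour: all 64 bits survive.
constexpr uint64_t fullOpValue(int64_t Imm) {
  return static_cast<uint64_t>(Imm);
}

// Hypothetical helper: the six little-endian bytes of a 48-bit encoding.
inline std::array<uint8_t, 6> encode48(uint64_t Bits) {
  std::array<uint8_t, 6> Bytes{};
  for (unsigned I = 0; I < 6; ++I)
    Bytes[I] = static_cast<uint8_t>(Bits >> (8 * I));
  return Bytes;
}

// The 48-bit value used by the new test loses its upper 16 bits when narrowed.
static_assert(truncatedOpValue(0xffffffffffdf) == 0xffffffdfULL,
              "upper bits are dropped by the 32-bit cast");
static_assert(fullOpValue(0xffffffffffdf) == 0xffffffffffdfULL,
              "upper bits are kept with a 64-bit return type");

// The 64-bit test value is printed as the signed immediate -65.
static_assert(fullOpValue(-65) == 0xffffffffffffffbfULL,
              "-65 and 0xffffffffffffffbf share a bit pattern");
```

Under these assumptions, `encode48(fullOpValue(0xffffffffffdf))` yields the bytes `df ff ff ff ff ff`, matching the encoding the new test expects, whereas `encode48(truncatedOpValue(0xffffffffffdf))` yields `df ff ff ff 00 00`.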
From 0ae51f116bda27c590fa6197e2e3342b2f0e1f71 Mon Sep 17 00:00:00 2001
From: Sam Elliott <quic_aelliott at quicinc.com>
Date: Thu, 10 Oct 2024 10:39:50 -0700
Subject: [PATCH] [RISCV][MC] Fix >32bit .insn Directives
The original patch had a reasonably significant bug. You could not use
`.insn` to assemble encodings that had any bits set above the low 32
bits. This is due to the fact that `getMachineOpValue` was truncating
the immediate value, and I did not commit enough tests of useful cases.
This changes the result of `getMachineOpValue` to be able to return the
48-bit and 64-bit immediates needed for the wider `.insn` directives.
I took the opportunity to move some of the test cases around in the file
to make looking at the output of `llvm-objdump` a little clearer.
---
.../RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp | 6 ++--
llvm/test/MC/RISCV/insn.s | 35 +++++++++++++++----
2 files changed, 32 insertions(+), 9 deletions(-)
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index 66970ed37f2724..54f1a3899c4957 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -77,7 +77,7 @@ class RISCVMCCodeEmitter : public MCCodeEmitter {
/// Return binary encoding of operand. If the machine operand requires
/// relocation, record the relocation and return zero.
- unsigned getMachineOpValue(const MCInst &MI, const MCOperand &MO,
+ uint64_t getMachineOpValue(const MCInst &MI, const MCOperand &MO,
SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;
@@ -375,7 +375,7 @@ void RISCVMCCodeEmitter::encodeInstruction(const MCInst &MI,
++MCNumEmitted; // Keep track of the # of mi's emitted.
}
-unsigned
+uint64_t
RISCVMCCodeEmitter::getMachineOpValue(const MCInst &MI, const MCOperand &MO,
SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {
@@ -384,7 +384,7 @@ RISCVMCCodeEmitter::getMachineOpValue(const MCInst &MI, const MCOperand &MO,
return Ctx.getRegisterInfo()->getEncodingValue(MO.getReg());
if (MO.isImm())
- return static_cast<unsigned>(MO.getImm());
+ return MO.getImm();
llvm_unreachable("Unhandled expression!");
return 0;
diff --git a/llvm/test/MC/RISCV/insn.s b/llvm/test/MC/RISCV/insn.s
index e32fec25bb16b4..d24f4fe8b36374 100644
--- a/llvm/test/MC/RISCV/insn.s
+++ b/llvm/test/MC/RISCV/insn.s
@@ -170,17 +170,40 @@ target:
# CHECK-OBJ: <unknown>
.insn 6, 0x1f
-# CHECK-ASM: .insn 0x4, 65503
-# CHECK-ASM: encoding: [0xdf,0xff,0x00,0x00]
-# CHECK-OBJ: <unknown>
-.insn 0xffdf
-
# CHECK-ASM: .insn 0x8, 63
# CHECK-ASM: encoding: [0x3f,0x00,0x00,0x00,0x00,0x00,0x00,0x00]
# CHECK-OBJ: <unknown>
.insn 8, 0x3f
+# CHECK-ASM: .insn 0x6, 281474976710623
+# CHECK-ASM: encoding: [0xdf,0xff,0xff,0xff,0xff,0xff]
+# CHECK-OBJ: <unknown>
+.insn 0x6, 0xffffffffffdf
+
+# CHECK-ASM: .insn 0x8, -65
+# CHECK-ASM: encoding: [0xbf,0xff,0xff,0xff,0xff,0xff,0xff,0xff]
+# CHECK-OBJ: <unknown>
+.insn 0x8, 0xffffffffffffffbf
+
+odd_lengths:
+# CHECK-ASM-LABEL: odd_lengths:
+# CHECK-OBJ-LABEL: <odd_lengths>:
+
+## These deliberately disagree with the lengths objdump expects them to have, so
+## keep them at the end so that the disassembled instruction stream is not out
+## of sync with the encoded instruction stream. We don't check for `<unknown>`
+## as we could get any number of those, so instead check for the encoding
+## halfwords. These might be split into odd 16-bit chunks, so each chunk is on
+## one line.
+
+# CHECK-ASM: .insn 0x4, 65503
+# CHECK-ASM: encoding: [0xdf,0xff,0x00,0x00]
+# CHECK-OBJ: ffdf
+# CHECK-OBJ: 0000
+.insn 0xffdf
+
# CHECK-ASM: .insn 0x4, 65471
# CHECK-ASM: encoding: [0xbf,0xff,0x00,0x00]
-# CHECK-OBJ: <unknown>
+# CHECK-OBJ: ffbf
+# CHECK-OBJ: 0000
.insn 0xffbf