[llvm] [feature][riscv] handle target address calculation in llvm-objdump disassembly for riscv (PR #109914)

Tue Nov 19 12:35:01 PST 2024

================
@@ -230,6 +246,97 @@ class RISCVMCInstrAnalysis : public MCInstrAnalysis {
     return false;
   }
 
+  bool evaluateInstruction(const MCInst &Inst, uint64_t Addr, uint64_t Size,
+                           uint64_t &Target) const override {
+    switch(Inst.getOpcode()) {
+      default:
+        return false;
+      case RISCV::ADDI: {
+        if (auto TargetRegState = getGPRState(Inst.getOperand(1).getReg())) {
+          // TODO: Figure out ways to find the actual value of XLEN during analysis
+          int XLEN = 32;
+          uint64_t Mask = ~((uint64_t)0) >> (64 - XLEN);
+          Target = *TargetRegState + SignExtend64<12>(Inst.getOperand(2).getImm());
+          Target &= Mask;
+          return true;
+        }
+        break;
+      }
+      case RISCV::ADDIW: {
+        if (auto TargetRegState = getGPRState(Inst.getOperand(1).getReg())) {
+          uint64_t Mask = ~((uint64_t)0) >> 32;
+          Target  = *TargetRegState + SignExtend64<12>(Inst.getOperand(2).getImm());
+          Target &= Mask;
----------------
arjunUpatel wrote:

As per the specification, ADDIW:

> adds the sign-extended 12-bit immediate to register rs1 and produces the proper sign-extension of a 32-bit result in rd. Overflows are ignored and the result is the low 32 bits of the result sign-extended to 64 bits

Once the addition on line 268 completes, we must ignore the overflow. The `&=` applies a mask whose top 32 bits are clear and whose bottom 32 bits are set. This clears the top 32 bits of the target while preserving the low 32 bits, so any arithmetic overflow is ignored.
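
As a rough illustration of the masking step, here is a minimal standalone sketch. The helper names `signExtend12` and `addiwLow32` are hypothetical and are not part of the patch; `signExtend12` only stands in for LLVM's `SignExtend64<12>` so that the snippet compiles without LLVM headers.

```cpp
#include <cstdint>

// Hypothetical stand-in for SignExtend64<12>: interpret the low 12 bits
// of Imm as a signed two's-complement value.
constexpr int64_t signExtend12(uint64_t Imm) {
  return static_cast<int64_t>((Imm & 0xFFF) ^ 0x800) - 0x800;
}

// Hypothetical helper mirroring the ADDIW case in the hunk: add the
// sign-extended immediate, then keep only the low 32 bits of the sum.
constexpr uint64_t addiwLow32(uint64_t Rs1, int64_t Imm) {
  constexpr uint64_t Mask = ~uint64_t{0} >> 32; // 0x00000000FFFFFFFF
  uint64_t Sum =
      Rs1 + static_cast<uint64_t>(signExtend12(static_cast<uint64_t>(Imm)));
  return Sum & Mask; // carry out of bit 31 is discarded
}

// The carry into bit 32 is dropped by the mask.
static_assert(addiwLow32(0xFFFFFFFFull, 1) == 0, "overflow is ignored");
// A negative immediate subtracts, as expected after sign extension.
static_assert(addiwLow32(0x1000, -1) == 0xFFF, "negative immediate");
```

Note that this sketch, like the masked `Target` in the patch, yields the low 32 bits zero-extended. For example, `addiwLow32(0x7FFFFFFF, 1)` gives `0x80000000`, whereas the value the spec describes for `rd` would be that result sign-extended to 64 bits.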

https://github.com/llvm/llvm-project/pull/109914