<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/133712">133712</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [RISCV] Incorrect disassembly of c.slli when imm > 31
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          paulhuggett
      </td>
    </tr>
</table>

<pre>
    
# context:

A binary can contain an invalid instruction encoding. When `llvm-objdump` disassembles such an encoding, it reports the instruction as unknown.

For example, the encoding 0x9005 represents the instruction `c.srli x8, 0x21`. For RV32C, `c.srli` is only valid when the immediate is in the range [1, 31], so this encoding is invalid.

If I disassemble a binary containing the encoding 0x9005, I get:

```
$ cat disasm.s
.insn 0x9005    # c.srli x8, 0x21
$ clang -c --target=riscv32 -march=rv32imafdc disasm.s -o disasm.o
$ llvm-objdump -M no-aliases -d disasm.o

disasm.o: file format elf32-littleriscv

Disassembly of section .text:

00000000 <.text>:
 0: 9005          <unknown>
```
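
To make the `0x9005` example easier to check by hand, here is a minimal Python sketch of the field extraction, assuming the standard RVC layout for `c.srli` (quadrant in bits [1:0], funct3 in bits [15:13], funct2 in bits [11:10], rd' in bits [9:7], shamt[5] in bit 12 and shamt[4:0] in bits [6:2]). The helper name `decode_c_srli` is hypothetical and is not part of LLVM.

```python
def decode_c_srli(halfword):
    """Split a 16-bit encoding into C.SRLI fields; return None if it is not C.SRLI."""
    quadrant = halfword & 0b11             # bits [1:0]
    funct3 = (halfword >> 13) & 0b111      # bits [15:13]
    funct2 = (halfword >> 10) & 0b11       # bits [11:10]
    if (quadrant, funct3, funct2) != (0b01, 0b100, 0b00):
        return None
    rd = 8 + ((halfword >> 7) & 0b111)     # rd' in bits [9:7] selects x8..x15
    shamt_hi = (halfword >> 12) & 1        # shamt[5]
    shamt_lo = (halfword >> 2) & 0b11111   # shamt[4:0]
    return rd, (shamt_hi * 32) | shamt_lo


rd, shamt = decode_c_srli(0x9005)
print(f"c.srli x{rd}, {shamt:#x}")           # c.srli x8, 0x21
print("valid on RV32C:", 1 <= shamt <= 31)   # valid on RV32C: False
```

The range check at the end uses the [1, 31] bound quoted above.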

# description of the problem:

According to the RISC-V Instruction Set Manual, for RV32C, the instruction `c.slli` is only valid when the immediate is in the range [1, 31]. For all base ISAs, the code points with `rd = x0` are HINTs, except those with `shamt[5]=1` in RV32C (where `shamt` is the shift amount immediate). For RV32C, the code points with `shamt[5]=1` are designated for custom extensions.
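
The rule in the paragraph above can be sketched as a small classifier. This is only an illustration under stated assumptions: it assumes the standard RVC layout for `c.slli` (quadrant `10`, funct3 `000`, rd in bits [11:7], shamt[5] in bit 12, shamt[4:0] in bits [6:2]), the function name and category strings are hypothetical, and it is not how LLVM's decoder is implemented.

```python
def classify_c_slli_rv32(halfword):
    """Classify a 16-bit code point in the C.SLLI encoding space for RV32C."""
    quadrant = halfword & 0b11
    funct3 = (halfword >> 13) & 0b111
    if (quadrant, funct3) != (0b10, 0b000):
        return "not c.slli"
    rd = (halfword >> 7) & 0b11111
    shamt = (((halfword >> 12) & 1) * 32) | ((halfword >> 2) & 0b11111)
    if shamt >= 32:
        # shamt[5] = 1: designated for custom extensions on RV32C
        return "custom"
    if shamt == 0:
        # Outside the [1, 31] range quoted above; not covered further here
        return "out of range"
    if rd == 0:
        # rd = x0 with shamt[5] = 0: HINT code point
        return "hint"
    return "c.slli"


for code in (0x0086, 0x0006, 0x1006):
    print(f"{code:#06x}: {classify_c_slli_rv32(code)}")
```

With this sketch, `0x1006` falls in the "custom" category, so a disassembler without a matching custom extension would have no instruction to print for it.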

As a consequence, the following instruction is invalid (with no custom extension):

```
$ cat compile.s
c.slli x0, 0x21
```
The LLVM assembler rejects it:
```
$ clang -c --target=riscv32 -march=rv32imafdc compile.s -o compile.o
compile.s:1:12: error: immediate must be an integer in the range [1, 31]
c.slli x0, 0x21
           ^
```
However, if a binary contains such an invalid encoding, `llvm-objdump` disassembles it as if it were valid:
```
$ cat problem.s
.insn 0x1006  # c.slli x0, 0x21
$ clang -c --target=riscv32 -march=rv32imafdc problem.s -o problem.o
$ llvm-objdump -M no-aliases -d problem.o

problem.o:  file format elf32-littleriscv

Disassembly of section .text:

00000000 <.text>:
       0: 1006          c.slli  zero, 0x21
```

The problem occurs with the following 32 encodings:

- 0x1002 is disassembled to `c.slli  zero, 0x20`
- 0x1006 is disassembled to `c.slli  zero, 0x21`
- 0x100a is disassembled to `c.slli  zero, 0x22`
- 0x100e is disassembled to `c.slli  zero, 0x23`
- 0x1012 is disassembled to `c.slli  zero, 0x24`
- 0x1016 is disassembled to `c.slli  zero, 0x25`
- 0x101a is disassembled to `c.slli  zero, 0x26`
- 0x101e is disassembled to `c.slli  zero, 0x27`
- 0x1022 is disassembled to `c.slli  zero, 0x28`
- 0x1026 is disassembled to `c.slli  zero, 0x29`
- 0x102a is disassembled to `c.slli  zero, 0x2a`
- 0x102e is disassembled to `c.slli  zero, 0x2b`
- 0x1032 is disassembled to `c.slli  zero, 0x2c`
- 0x1036 is disassembled to `c.slli  zero, 0x2d`
- 0x103a is disassembled to `c.slli  zero, 0x2e`
- 0x103e is disassembled to `c.slli  zero, 0x2f`
- 0x1042 is disassembled to `c.slli  zero, 0x30`
- 0x1046 is disassembled to `c.slli  zero, 0x31`
- 0x104a is disassembled to `c.slli  zero, 0x32`
- 0x104e is disassembled to `c.slli  zero, 0x33`
- 0x1052 is disassembled to `c.slli  zero, 0x34`
- 0x1056 is disassembled to `c.slli  zero, 0x35`
- 0x105a is disassembled to `c.slli  zero, 0x36`
- 0x105e is disassembled to `c.slli  zero, 0x37`
- 0x1062 is disassembled to `c.slli  zero, 0x38`
- 0x1066 is disassembled to `c.slli  zero, 0x39`
- 0x106a is disassembled to `c.slli  zero, 0x3a`
- 0x106e is disassembled to `c.slli  zero, 0x3b`
- 0x1072 is disassembled to `c.slli  zero, 0x3c`
- 0x1076 is disassembled to `c.slli  zero, 0x3d`
- 0x107a is disassembled to `c.slli  zero, 0x3e`
- 0x107e is disassembled to `c.slli  zero, 0x3f`
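
The list above follows a regular pattern: `rd = x0`, `shamt[5] = 1`, and `shamt[4:0]` running from 0 to 31. A short Python sketch that regenerates it, assuming the same `c.slli` field layout (the generator name is hypothetical):

```python
def rv32_c_slli_x0_custom_encodings():
    """Yield (encoding, shamt) for every c.slli x0 code point with shamt[5] = 1."""
    for low in range(32):
        shamt = 0x20 | low
        # quadrant 10, funct3 = 000, rd = x0, bit 12 = shamt[5], bits [6:2] = shamt[4:0]
        halfword = 0b10 | (low << 2) | (1 << 12)
        yield halfword, shamt


for halfword, shamt in rv32_c_slli_x0_custom_encodings():
    print(f"- {halfword:#06x} is disassembled to `c.slli  zero, {shamt:#x}`")
```

Each value could be placed in a `.insn` line, as in `problem.s`, to reproduce the corresponding entry.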


# expected result:

Since these 32 encodings represent invalid instructions, `llvm-objdump` should report them as unknown.

</pre>