<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/68855>68855</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Clang's Optimization Introduces Unexpected Sign Extension in RISC-V Bit-Field Operations
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
gyuminb
</td>
</tr>
</table>
<pre>
### **Environment:**
- Compiler: Clang-18
- Target Architecture: RISC-V
- Optimization Level: **`-O1`**, **`-O2`**, **`-O3`**
- OS: Ubuntu 22.04.2
### **Summary:**
When compiling code that combines bit-field operations with type casts, Clang targeting the RISC-V architecture produces an unexpected result at optimization levels **`-O1`**, **`-O2`**, and **`-O3`**. The result deviates from what the C language standard requires and does not occur at **`-O0`**.
### **Steps to Reproduce:**
1. Compile the provided source code with Clang, targeting the RISC-V architecture.
2. Use optimization levels **`-O1`**, **`-O2`**, or **`-O3`**.
3. Execute the compiled binary.
### **Expected Result:**
```
resultValue1: ffff
resultValue2: 0
```
### **Actual Result:**
```
resultValue1: ffffffff
resultValue2: 0
```
### **Source Code to Reproduce:**
```c
#include <stdio.h>

typedef struct {
    unsigned int bitField : 13;
} CustomStruct;

unsigned int resultValue1 = 0;
short resultValue2 = 0;
CustomStruct customArray[2] = {{0U}, {0U}};

int main()
{
    /* Invert the 13-bit field, narrow to 16 bits, then widen to unsigned int. */
    resultValue1 = (unsigned int) ((unsigned short) (~(customArray[0].bitField)));
    printf("resultValue1: %x\n", resultValue1);
    resultValue2 = (short) (customArray[1].bitField);
    printf("resultValue2: %x\n", resultValue2);
    return 0;
}
```
### **Observation:**
**`customArray[0].bitField`** is a 13-bit unsigned bit field holding 0. Because every value of a 13-bit unsigned field fits in an **`int`**, the integer promotions convert it to **`int`** before the **`~`** operator is applied. Inverting 0 therefore yields an **`int`** with all bits set (0xFFFFFFFF, i.e. -1), not the 13-bit value 0x1FFF.
Casting this value to **`(unsigned short)`** reduces it modulo 65536, giving the 16-bit (2 bytes) value 0xFFFF.
Casting further to **`(unsigned int)`** zero-extends the value, so it should remain 0xFFFF. This is the behavior the C language standard requires for these conversions.
However, while this is the result without optimization (**`-O0`**), with optimization enabled the value unexpectedly becomes 0xFFFFFFFF. It seems that after the cast to **`unsigned short`**, the extension to **`unsigned int`** isn't carried out correctly, possibly sign-extending rather than zero-extending the value.
This suggests a potential issue with either the RISC-V target implementation or this version of the Clang compiler. Because the result deviates from standard C, a compiler bug is probable.
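The conversion chain described above can be written out one step at a time. The following is a minimal sketch, assuming a 16-bit **`unsigned short`** and a 32-bit **`int`**; the names **`invert_and_widen`** and **`check_conversions`** are illustrative and not part of the original reproducer.

```c
#include <assert.h>

typedef struct {
    unsigned int bitField : 13;
} CustomStruct;

/* Hypothetical helper: evaluates the expression from the report
   one conversion at a time. */
static unsigned int invert_and_widen(const CustomStruct *s)
{
    int promoted = s->bitField;   /* integer promotion: a 13-bit unsigned field fits in int */
    int inverted = ~promoted;     /* -1 (all bits set) when the field is 0 */
    unsigned short narrowed = (unsigned short)inverted; /* reduced modulo 65536 (assumes 16-bit short) */
    return (unsigned int)narrowed; /* zero-extension: value unchanged */
}

/* Hypothetical self-check of the values the C conversion rules give. */
static void check_conversions(void)
{
    CustomStruct s = { 0U };
    assert(invert_and_widen(&s) == 0xFFFFu);  /* the case from the report */
    s.bitField = 0x1FFF;
    assert(invert_and_widen(&s) == 0xE000u);  /* ~0x1FFF narrowed to 16 bits */
}
```

Calling **`check_conversions()`** from **`main`** should pass silently on a conforming compiler under the stated width assumptions; a result of 0xFFFFFFFF for the first case would trip the assertion.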
### **Additional Information:**
- https://godbolt.org/z/Pv3Gaacv9
- The issue seems to stem from the **`slli`** and **`srli`** instructions used in succession in the optimized versions, resulting in sign-extension.
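For reference, the difference between the two kinds of extension can be modeled in C. This is only a sketch of the general technique on a 32-bit value, not a transcription of the generated code: a left shift followed by a logical right shift (the **`slli`**/**`srli`** pattern) zero-extends, whereas a left shift followed by an arithmetic right shift (**`srai`**) sign-extends. Which pattern the optimized output actually contains should be checked against the Godbolt listing. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low 16 bits: shift left, then logical shift right. */
static uint32_t zext16(uint32_t x)
{
    return (uint32_t)(x << 16) >> 16;
}

/* Sign-extend the low 16 bits. Written with the portable xor/subtract
   idiom rather than an arithmetic shift, whose result for negative
   signed values is implementation-defined in C. */
static uint32_t sext16(uint32_t x)
{
    return (uint32_t)(((x & 0xFFFFu) ^ 0x8000u) - 0x8000u);
}

/* Hypothetical self-check contrasting the two results. */
static void check_extensions(void)
{
    assert(zext16(0xFFFFFFFFu) == 0x0000FFFFu); /* the expected result */
    assert(sext16(0x0000FFFFu) == 0xFFFFFFFFu); /* the observed result */
    assert(sext16(0x00001234u) == 0x00001234u); /* bit 15 clear: unchanged */
}
```

The expected output in this report corresponds to **`zext16`**, and the observed output matches what **`sext16`** (or a dropped truncation) would produce.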
### **Recommendation:**
Please verify the behavior observed using the provided Godbolt link and investigate the underlying cause in the Clang compiler for RISC-V. It's essential to ensure consistent behavior across optimization levels and adherence to the C language standard.
</pre>