<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/60685">60685</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missed Optimization: Superfluous Addressing on Data Load
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
geometrian
</td>
</tr>
</table>
<pre>
Consider the following simple code (which converts a hex digit character, case-insensitive, to an `int`):
```cpp
int hexchar_to_int(char hexchar) {
    constexpr static unsigned char const data[] = {
        255, 10,11,12,13,14,15, 255,255,255,255,255,255,255,255,255, 0,1,2,3,4,5,6,7,8,9
    };
    return data[ hexchar & 0b00011111 ];
}
```
GCC compiles this to:
```as
hexchar_to_int(char):
        and     edi, 31
        movzx   eax, BYTE PTR hexchar_to_int(char)::data[rdi]
        ret
```
However, Clang has an extra instruction:
```as
hexchar_to_int(char):
        and     edi, 31
        lea     rax, [rip + hexchar_to_int(char)::data]
        movzx   eax, byte ptr [rdi + rax]
        ret
```
GCC folds the table's address directly into the load's addressing mode, whereas Clang materializes it with a separate `lea` first. I believe GCC's output is strictly superior.
</pre>