<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/121909>121909</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Change in behavior: Comparing two pointers after adding an offset to the first pointer
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          frankmehnert
      </td>
    </tr>
</table>

<pre>
    Clang-20 (I'm using the current Debian version `++20241212011120+0cbdad4bd239-1~exp1`) changed the behavior of pointer comparisons. This affects, for instance, uclibc, which contains the following [code](https://github.com/kraj/uClibc/blob/master/libc/string/generic/strnlen.c#L30):
```C++
size_t strnlen (const char *str, size_t maxlen)
{
  const char *char_ptr, *end_ptr = str + maxlen;
  ...
  if (__builtin_expect (end_ptr < str, 0))
    end_ptr = (const char *) ~0UL;
```
I hope the intention of this code is clear: the `maxlen` parameter can also be very large (`~0UL`), and in that case this function should behave like `strlen()`, without a limit. The comparison is intended to detect this case: the expectation is that adding `~0UL` to `str` wraps around and makes the result (`end_ptr`) _smaller_ than the original `str` value.

Clang-19 behaves as a user would expect: it performs the comparison and limits `end_ptr` to `~0UL`.
Clang-20 makes the comparison a NOP.

I assume that the pointer comparison is wrong because `str` and `end_ptr` obviously point to different objects, correct?

Changing the code to _variant 1_
```C++
if (__builtin_expect ((uintptr_t)end_ptr < (uintptr_t)str, 0))
  end_ptr = (const char *) ~0UL;
```
does not help either (the result is the same), but _variant 2_
```C++
if (__builtin_expect ((uintptr_t)str + maxlen < (uintptr_t)str, 0))
  end_ptr = (const char *) ~0UL;
```
makes Clang-20 behave as expected.
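
For reference, here is a minimal self-contained sketch of the idea behind _variant 2_: the end address is computed entirely in the unsigned integer domain, where wraparound is well defined. The helper name `clamped_end` is hypothetical and not part of uclibc, and the sketch assumes that `unsigned long` is as wide as a pointer (the same assumption the original `~0UL` makes).

```C++
/* Hypothetical helper for illustration, not part of uclibc.
   Assumption: unsigned long has the same width as a pointer. */
static unsigned long clamped_end (const char *str, unsigned long maxlen)
{
  unsigned long start = (unsigned long) str;
  unsigned long end = start + maxlen;   /* unsigned addition may wrap around */

  if (__builtin_expect (end < start, 0))
    end = ~0UL;                         /* wrapped: treat as "no limit" */
  return end;
}
```

Because no pointer arithmetic is involved, the wraparound check does not depend on any assumption about where `str + maxlen` points.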

For the sake of completeness, using _variant 3_
```C++
if (__builtin_expect ((intptr_t)end_ptr < (intptr_t)str, 0))
  end_ptr = (const char *) ~0UL;
```
would also force Clang-20 to perform the comparison, but it would be a signed comparison, which would return an unexpected result in edge cases (for a string located at `LONG_MAX`).
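
The edge cases of the signed comparison can be illustrated with a small sketch that works on plain integer addresses. The helper names `wrapped_unsigned` and `wrapped_signed` are hypothetical; the sketch assumes that `long` and `unsigned long` are pointer-sized and that converting an out-of-range `unsigned long` to `long` wraps, as GCC and Clang implement it.

```C++
/* Hypothetical helpers for illustration only.
   Assumptions: long/unsigned long are pointer-sized, and converting an
   out-of-range unsigned long to long wraps (GCC and Clang behavior). */

/* Unsigned check in the style of variant 2. */
static int wrapped_unsigned (unsigned long start, unsigned long maxlen)
{
  return start + maxlen < start;
}

/* Signed check in the style of variant 3. */
static int wrapped_signed (unsigned long start, unsigned long maxlen)
{
  return (long) (start + maxlen) < (long) start;
}
```

For a start address equal to `LONG_MAX` and `maxlen == 1`, the unsigned check reports no wraparound, while the signed check does because the end address crosses the sign boundary. Conversely, for a start address 16 bytes below the top of the address space and `maxlen == 32`, the unsigned check detects the wraparound and the signed check misses it.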

**Questions:** Is the uclibc code indeed incorrect and, if so, is Clang-20 correct to demand _variant 2_?
</pre>