<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/64451">64451</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            libc++ regex drops the character between a zero-length match and its subsequent match
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          rprichard
      </td>
    </tr>
</table>

<pre>
With libc++, when std::regex_iterator advances past a zero-length match (e.g. with the pattern `""` or `"^\\s*|\\s*$"`), it skips the character immediately after that match. As a result, std::regex_replace also drops that character.

Example:
```
#include <cstdio>
#include <regex>
#include <string>
int main() {
    std::string src = "AB";
    std::regex pattern("");

    // Expected 'xAxBx'. Actual output is 'xxx'.
    std::string repl = std::regex_replace(src, pattern, "x");
    printf("'%s'\n", repl.c_str());

    // Expected ['', 'A', 'B']. Actual output is ['', '', ''].
    std::sregex_iterator begin { src.begin(), src.end(), pattern };
    std::sregex_iterator end {};
    for (auto i = begin; i != end; ++i) {
        std::smatch m = *i;
        printf("'%s'\n", m.prefix().str().c_str());
    }

    return 0;
}
```
libstdc++ prints the expected output above.

This bug was originally reported against the Android NDK, https://github.com/android/ndk/issues/1911, where the pattern was `"^\\s*|\\s*$"`.

I wonder if operator++ needs to move the start of the prefix back by one character when it skips one character forward here:

https://github.com/llvm/llvm-project/blob/llvmorg-17.0.0-rc1/libcxx/include/regex#L6512
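
In the meantime, a possible workaround is to drive std::regex_search by hand and copy the unmatched text directly, so nothing depends on prefix(). This is only a minimal sketch: `replace_all_literal` is a hypothetical helper name (it is not in libc++ or in the NDK report), it inserts the replacement text literally without `$1`-style formatting, and it assumes std::regex_search itself behaves correctly. It follows the same stepping that regex_iterator's operator++ uses after a zero-length match: retry at the same position with match_not_null | match_continuous, and if that fails, step over one character.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of libc++ or of the original report.
// Replaces every match of `re` in `src` with the literal text `repl`
// (no $1-style format handling) and never reads match_results::prefix().
std::string replace_all_literal(const std::string& src, const std::regex& re,
                                const std::string& repl) {
    namespace rc = std::regex_constants;
    const auto begin = src.cbegin();
    const auto end = src.cend();
    auto pos = begin;         // next search starts here; [begin, pos) is already handled
    bool prev_empty = false;  // was the previous match zero-length?
    std::string out;
    std::smatch m;

    for (;;) {
        // Past the first character, tell the engine that *(pos - 1) is valid.
        const rc::match_flag_type flags =
            pos == begin ? rc::match_default : rc::match_prev_avail;
        if (prev_empty) {
            if (pos == end) break;
            prev_empty = false;
            // Prefer a non-empty match anchored exactly at pos.
            if (std::regex_search(pos, end, m, re,
                                  flags | rc::match_not_null | rc::match_continuous)) {
                assert(m[0].first == pos);
                out += repl;
                pos = m[0].second;
            } else {
                // No such match: keep this character and step over it.
                out += *pos++;
            }
            continue;
        }
        if (!std::regex_search(pos, end, m, re, flags)) break;
        out.append(pos, m[0].first);  // unmatched text before the match
        out += repl;
        prev_empty = (m[0].first == m[0].second);
        pos = m[0].second;
    }
    out.append(pos, end);  // trailing unmatched text
    return out;
}
```

With this, `replace_all_literal("AB", std::regex(""), "x")` should produce `xAxBx`, because the skipped character is appended explicitly in the `else` branch instead of being recovered from the prefix.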

</pre>