<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - unsafe pointer arithmetic in llvm_regcomp()"

   href="https://bugs.llvm.org/show_bug.cgi?id=48649">48649</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>unsafe pointer arithmetic in llvm_regcomp()

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Support Libraries

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>miod@trust-in-soft.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>llvm/lib/Support/regcomp.c is borrowed from OpenBSD, to which the following

issue has been reported and fixed. (report and patch in

<a href="https://marc.info/?l=openbsd-tech&m=160923823113340&w=2">https://marc.info/?l=openbsd-tech&m=160923823113340&w=2</a> )

regcomp.c uses the "start + count < end" idiom to check that there are "count"

bytes available in an array of char "start" and "end" both point to.

This is fine, unless "start + count" points more than one element past the end of

the array. In that case the C standard leaves the pointer arithmetic itself, and

hence the comparison against "end", undefined, and optimizers from hell will

happily remove as much code as possible because of this.

An example of this occurs in regcomp.c's bothcases(), which defines bracket[3],

sets "next" to "bracket" and "end" to "bracket + 2". Then it invokes

p_bracket(), which starts with "if (p->next + 5 < p->end)"...

Because bothcases() and p_bracket() are static functions in regcomp.c, there is

a real risk of miscompilation if aggressive inlining happens. The following

diff rewrites the "start + count < end" constructs into "end - start > count".

Assuming "end" and "start" always point into the same array, or one past its end

(such as "bracket[3]" above), "end - start" is well-defined and can be compared

without trouble.
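
A minimal sketch of the rewrite, for illustration only. The struct layout, the
function names and the bracket contents are assumptions made for this example
and are not copied from regcomp.c; only "next", "end", bracket[3] and the
"+ 5" check come from the description above.

```c
/* Hypothetical stand-in for the parser state: just the two pointers. */
struct parse {
	const char *next;	/* next unread character */
	const char *end;	/* one past the last readable character */
};

/*
 * The original idiom would be written as
 *	p->next + count < p->end
 * which forms a pointer outside the array when count is too large.
 * The rewritten check only subtracts two pointers into the same array.
 */
int has_room(const struct parse *p, long count)
{
	return p->end - p->next > count;
}

/* Mirrors the bothcases() setup: bracket[3], end = bracket + 2. */
int bracket_case_has_five(void)
{
	char bracket[3] = { '[', 'x', ']' };	/* placeholder contents */
	struct parse p = { bracket, bracket + 2 };

	/* end - next is 2, so the check is false and no bad pointer is formed. */
	return has_room(&p, 5);
}
```

With the old form, the same call would have had to compute "bracket + 5" on a
3-element array before comparing it.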

As a bonus, MORE2() implies MORE(), so SEETWO() can be simplified a bit.</pre>
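<pre>
The SEETWO() simplification can be sketched as follows. The macro bodies here
are assumed shapes written for this example; the real definitions in regcomp.c
and in the referenced patch may differ, and see_two() is a hypothetical wrapper
added only so the macros can be exercised.

```c
/* Hypothetical minimal parser state with the "next" and "end" pointers. */
struct parse {
	const char *next;	/* next unread character */
	const char *end;	/* one past the last readable character */
};

/* Assumed macro shapes using the "end - start > count" form. */
#define MORE()		(p->end - p->next > 0)
#define MORE2()		(p->end - p->next > 1)
#define PEEK()		(*p->next)
#define PEEK2()		(*(p->next + 1))

/* If more than one character remains, then more than zero remain,
 * so a separate MORE() test in front of MORE2() is redundant. */
#define SEETWO(a, b)	(MORE2() && PEEK() == (a) && PEEK2() == (b))

/* Returns 1 if the next two unread characters are a followed by b. */
int see_two(const struct parse *p, char a, char b)
{
	return SEETWO(a, b);
}
```

Because MORE2() is evaluated first, PEEK2() is only reached when two
characters are known to be available.
</pre>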

        </div>

      </p>


    </body>

</html>