<div dir="ltr"><div>Hi All,</div><div><br></div><div>There are two instrumentations in the ASan that are related to the subject:</div><div>- pointer-subtract,</div><div>- pointer-compare.</div><div><br></div><div>They check if a pointer arithmetic operation of a given kind (subtract or compare) is not UB. The pointers must be from the same memory buffer allocation, including the one-element-after end pointer.</div><div><br></div><div>> The option must be combined with [...]

<samp>-fsanitize=address</samp>

[...] By default the check is disabled at run time.  To enable it,

add <code>detect_invalid_pointer_pairs=2</code> to the environment variable

<code>ASAN_OPTIONS</code>. Using <code>detect_invalid_pointer_pairs=1</code> detects

invalid operation only when both pointers are non-null.

</div><div><br></div><div><a href="https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Instrumentation-Options.html">https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Instrumentation-Options.html</a></div><div><br></div><div>These were good news. Bad news are:</div><div>- The compiler flags and runtime options "work" with clang but these checks are not officially supported. E.g. see <a href="https://bugs.llvm.org/show_bug.cgi?id=47626">https://bugs.llvm.org/show_bug.cgi?id=47626</a>.</div><div>- Although these are supported by GCC, I have encountered multiple issues with them in the latest GCC  10. However, most (if not all) issues have been resolved and should be shipped with GCC 10.3 or GCC 11.</div><div>  - instrumentation may break constexpr evaluation: <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97145">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97145</a></div><div>  - these checks report valid issues in std::vector and std::stringbuf implementations in libstdc++: <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97659">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97659</a>, <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97415">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97415</a></div><div>  - libasan crashes when the <span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display">detect_stack_use_after_return runtime check is also enabled: <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97414">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97414</a></span></span></div><div><span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display"><br></span></span></div><div><span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display">Despite the above issues, these checks have found some bugs in my projects. I consider them useful if you develop in UB-free pedantic mode.</span></span></div><div><span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display"><br></span></span></div><div><span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display">Have fun,</span></span></div><div><span id="gmail-summary_container"><span id="gmail-short_desc_nonedit_display">Paweł Bylica<br></span></span></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 6, 2021 at 10:56 PM Demi M. Obenour via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 1/14/21 7:52 PM, Johannes Doerfert via cfe-dev wrote:<br>

> We would/should not exploit UB in such a case, at least not in the shown<br>

> example. The pointer computation might yield `poison` but that is it.<br>

> If you'd use the pointer afterwards, the situation is different though.<br>

> <br>

> ~ Johannes<br>

<br>

The problem is the following pattern in C:<br>

<br>

```<br>

#include <stdlib.h><br>

#include <stdint.h><br>

#include <arpa/inet.h><br>

#include <string.h><br>

#include <stdbool.h><br>

<br>

/* Struct for parsing results */<br>

typedef struct some_struct some_struct;<br>

/* Stream of untrusted data */<br>

typedef struct untrusted_stream untrusted_stream;<br>

<br>

/* Parse untrusted data into `output`.  Returns false on error */<br>

bool parse(untrusted_stream *stream, some_struct *output);<br>

<br>

/*<br>

 * Read `len` bytes from `stream` into `ptr`.  Returns false if it cannot read<br>

 * the full amount.<br>

 */<br>

bool read_data(untrusted_stream *stream, void *ptr, uint32_t len);<br>

<br>

/* Maximum input length */<br>

static const uint32_t MAX_INPUT = 1024 * 1024;<br>

<br>

static bool extract_be_u32(const uint8_t *ptr, const uint8_t *end, uint32_t *out);<br>

<br>

/**<br>

 * Parse some untrusted data of length `len`.  We are guaranteed that `ptr` points<br>

 * to `len` bytes of data.<br>

 * @param ptr Pointer to `len` bytes of untrusted data.<br>

 * @param len The length of the untrusted data.<br>

 */<br>

bool process_untrusted_input(const uint8_t *ptr, const size_t len, some_struct *bar) {<br>

   const uint8_t *end = ptr + len; /* end is now one-past-the-end */<br>

   uint32_t foo;<br>

   /* some parsing code */<br>

   if (!extract_be_u32(ptr, end, &foo))<br>

      return false;<br>

   foo = ntohl(foo);<br>

   /* more parsing code follows that uses foo */<br>

   return true;<br>

}<br>

<br>

bool extract_be_u32(const uint8_t *ptr, const uint8_t *end, uint32_t *out) {<br>

   uint32_t res;<br>

   /* this is the bug: it should be `end - ptr < sizeof res` */<br>

   if (ptr + sizeof res > end)<br>

      return false;<br>

   memcpy(&res, ptr, sizeof res);<br>

   *out = ntohl(res);<br>

   return true;<br>

}<br>

<br>

/* Parse some data from `stream` into `output`.  The data is not trusted. */<br>

bool parse(untrusted_stream *stream, some_struct *output) {<br>

   uint32_t len;<br>

   if (!read_data(stream, &len, sizeof len))<br>

      return false;<br>

   len = ntohl(len);<br>

   /* Avoid DoS attacks */<br>

   if (len > MAX_INPUT)<br>

      return false;<br>

   uint8_t *ptr = malloc(len);<br>

   if (!ptr || !read_data(stream, ptr, len))<br>

      return false;<br>

   bool res = process_untrusted_input(ptr, len, output);<br>

   free(ptr);<br>

   return res;<br>

}<br>

```<br>

<br>

This code has undefined behavior if `process_untrusted_input` is passed less<br>

than 4 bytes of input, which an attacker can often arrange to happen.  However,<br>

I know of no way to detect this, or even test for it.  Sanitizers and valgrind<br>

won’t catch it, since the out-of-bounds pointer is not dereferenced.  And the<br>

amount by which the pointer is out of bounds is bounded, so it won’t wrap<br>

in practice.  And if Clang optimized away the bounds check, the result is a<br>

security vulnerability.<br>

<br>

This isn’t just theoretical; I have seen similar bugs in the wild.<br>

Fortunately, I haven’t seen any cases of them being miscompiled.<br>

<br>

Will Clang miscompile code like this?  Even though it is undefined behavior,<br>

it happens often enough that I would prefer if Clang made it defined as<br>

an extension.  Dereferencing the out-of-bounds pointer would still be UB, of<br>

course.  I believe there is precedent for this, in that type-punning via unions<br>

is undefined behavior in C++ but (to my knowledge) works on both GCC and Clang.<br>

Also, GCC supports `-fwrapv-pointer`.<br>

<br>

Sincerely,<br>

<br>

Demi<br>

<br>

_______________________________________________<br>

cfe-dev mailing list<br>

<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

</blockquote></div>