<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Instrument <, <=, >, >= and - on pointers to find when unrelated pointers are compared (subtracted)"
   href="http://llvm.org/bugs/show_bug.cgi?id=18989">18989</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Instrument <, <=, >, >= and - on pointers to find when unrelated pointers are compared (subtracted)
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>new-bugs
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>new bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>kcc@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>chandlerc@gmail.com, llvmbugs@cs.uiuc.edu, nlewycky@google.com, richard-llvm@metafoo.co.uk
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Comparing (<, <=, >, >=) and subtracting pointers to different objects in C++ 
is undefined behavior.
There are cases when this kind of UB really hurts (e.g. sorting pointers by
their value, then iterating over the sorted container and depending on the
order of the elements).
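
To make the hazard concrete, here is a minimal sketch (the helper names
sort_by_address and address_order_matches_creation_order are made up for
illustration, they are not from any real code base). It sorts with std::less,
so the sort itself is well defined, but the resulting iteration order is the
address order, which depends on the allocator and not on the program logic:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Sorts pointers by address. std::less is used instead of the built-in '<'
// because it is guaranteed to give a strict total order even for pointers
// into unrelated objects. (Hypothetical helper, for illustration only.)
std::vector<char*> sort_by_address(std::vector<char*> ptrs) {
  std::sort(ptrs.begin(), ptrs.end(), std::less<char*>());
  return ptrs;
}

// Returns true if iterating over the sorted pointers visits the objects in
// the order in which they were handed in. Nothing guarantees this for
// separately allocated objects, so code must not depend on it.
bool address_order_matches_creation_order(const std::vector<char*>& created) {
  return sort_by_address(created) == created;
}
```

With the built-in '<' as the comparator the same code would additionally
compare unrelated pointers directly, which is the pattern this request is
about.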

We've got a feature request for AddressSanitizer to implement a detector for
this kind of UB:
<a href="https://code.google.com/p/address-sanitizer/issues/detail?id=269">https://code.google.com/p/address-sanitizer/issues/detail?id=269</a>

A naive implementation is trivial: insert a function call before all relevant
instructions and do the check inside the run-time (how to make this fast is
another question).
See r202389.
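
As a rough illustration of what such a run-time check could look like, here is
a sketch under stated assumptions. It is not the code from r202389: the
registry g_regions and the functions register_region, region_start and
same_region are hypothetical, and a real run-time would consult its own
allocator metadata instead of a std::map.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical registry of live allocations: start address -> size in bytes.
static std::map<std::uintptr_t, std::size_t> g_regions;

// Records an allocation so that later checks can find it.
void register_region(const void* p, std::size_t size) {
  g_regions[reinterpret_cast<std::uintptr_t>(p)] = size;
}

// Returns the start of the registered region containing addr, or 0 if there
// is none. The one-past-the-end address counts as inside the region, since
// such pointers may be compared with pointers into the same array. Regions
// that are directly adjacent are not disambiguated in this sketch.
static std::uintptr_t region_start(std::uintptr_t addr) {
  auto it = g_regions.upper_bound(addr);  // first region starting after addr
  if (it == g_regions.begin()) return 0;
  --it;  // last region starting at or before addr
  return addr <= it->first + it->second ? it->first : 0;
}

// The check a compiler pass would call before each pointer <, <=, >, >= or -.
// Returns true when both pointers belong to the same known region.
bool same_region(const void* a, const void* b) {
  std::uintptr_t ra = region_start(reinterpret_cast<std::uintptr_t>(a));
  std::uintptr_t rb = region_start(reinterpret_cast<std::uintptr_t>(b));
  return ra != 0 && ra == rb;
}
```

The lookup works on integer addresses, so the checker itself never compares
unrelated pointers. An ordered map gives a logarithmic lookup per check, which
is exactly the kind of cost the "how to make this fast" question is about.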

Example 1: "p1 < p2" 

% cat cmp.cc 
bool cmp(char *a, char *b) {
  return a < b;
}

int main() {
  char *a = new char;
  char *b = new char;
  cmp(a, b);
}
% clang -g -fsanitize=address -mllvm -asan-detect-invalid-pointer-pair=1 cmp.cc && ASAN_OPTIONS=detect_invalid_pointer_pairs=1 ./a.out
=================================================================
==3998==ERROR: AddressSanitizer: invalid-pointer-pair: 0x60200000eff0 0x60200000efd0
    #0 0x4801b2 in cmp(char*, char*) /tmp/cmp.cc:2
    #1 0x4803b0 in main /tmp/cmp.cc:8

0x60200000eff0 is located 0 bytes inside of 1-byte region [0x60200000eff0,0x60200000eff1)
allocated by thread T0 here:
    #0 0x4657f3 in operator new(unsigned long)
    #1 0x480304 in main /tmp/cmp.cc:6

0x60200000efd0 is located 0 bytes inside of 1-byte region [0x60200000efd0,0x60200000efd1)
allocated by thread T0 here:
    #0 0x4657f3 in operator new(unsigned long)
    #1 0x480349 in main /tmp/cmp.cc:7


Example 2: "p1 - p2" 

% cat diff.cc 
long diff(char *a, char *b) {
  return a - b;
}

int main() {
  char *a = new char;
  char *b = new char;
  diff(a, b);
}
% clang -g -fsanitize=address -mllvm -asan-detect-invalid-pointer-pair=1 diff.cc && ASAN_OPTIONS=detect_invalid_pointer_pairs=1 ./a.out
=================================================================
==4054==ERROR: AddressSanitizer: invalid-pointer-pair: 0x60200000eff0 0x60200000efd0
    #0 0x4801b2 in diff(char*, char*) /tmp/diff.cc:2
    #1 0x4803b0 in main /tmp/diff.cc:8
...



However, std::less on pointers is legal (it is required to yield a total order
even for unrelated pointers), yet at the IR level it looks the same as a
built-in comparison.
So this case is a false positive:

% cat less.cc 
#include <functional>
bool cmp(char *a, char *b) {
  return std::less<void*>()(a, b);
}

int main() {
  char *a = new char;
  char *b = new char;
  cmp(a, b);
}
% clang -g -fsanitize=address -mllvm -asan-detect-invalid-pointer-pair=1 less.cc && ASAN_OPTIONS=detect_invalid_pointer_pairs=1 ./a.out
=================================================================
==4135==ERROR: AddressSanitizer: invalid-pointer-pair: 0x60200000eff0 0x60200000efd0
    #0 0x480733 in std::less<void*>::operator()(void* const&, void* const&) const /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/bits/stl_function.h:236
    #1 0x48025e in cmp(char*, char*) /tmp/less.cc:3
    #2 0x480460 in main /tmp/less.cc:9


Also, AddressSanitizer inserts its instrumentation at a late stage of the
optimization pipeline, where the pointers may have been transformed into
integers or vice versa.

So, we need some cooperation from the frontend to indicate which instructions
would lead to UB if applied to unrelated pointers.
One way is to add some metadata to such instructions; another way is to
instrument the instructions in the frontend.
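
For the second option, the effect of frontend instrumentation can be sketched
at the source level as follows. This is only an illustration: the hook
check_pointer_pair and the wrapper names are hypothetical, the hook merely
counts calls instead of checking anything, and the sketch assumes the frontend
has some way to exempt the comparison performed inside std::less.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical run-time hook. It only counts calls here; a real one would
// verify that both pointers refer to the same object.
static int g_checks = 0;
void check_pointer_pair(const void*, const void*) { ++g_checks; }

// What the frontend could conceptually emit for the built-in 'a < b'.
bool instrumented_less(const char* a, const char* b) {
  check_pointer_pair(a, b);
  return a < b;
}

// What the frontend could conceptually emit for the built-in 'a - b'.
std::ptrdiff_t instrumented_diff(const char* a, const char* b) {
  check_pointer_pair(a, b);
  return a - b;
}

// A comparison through std::less is legal for unrelated pointers, so under
// the stated assumption no hook call is emitted for it.
bool total_order_less(const char* a, const char* b) {
  return std::less<const char*>()(a, b);
}
```

Because the decision is made while the operands are still known to be pointers
and the source-level operator is still visible, it does not depend on what
later passes do to the IR.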

Thoughts?</pre>
        </div>
      </p>
    </body>
</html>