[LLVMbugs] [Bug 22871] New: Crash when evaluating set s.end() with "-fsanitize=undefined-trap -fsanitize-undefined-trap-on-error" on release build

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Tue Mar 10 13:25:49 PDT 2015


https://llvm.org/bugs/show_bug.cgi?id=22871

            Bug ID: 22871
           Summary: Crash when evaluating set s.end() with
                    "-fsanitize=undefined-trap
                    -fsanitize-undefined-trap-on-error" on release build
           Product: clang
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: LLVM Codegen
          Assignee: unassignedclangbugs at nondot.org
          Reporter: bruno.cardoso at gmail.com
                CC: ganna at apple.com, llvmbugs at cs.uiuc.edu,
                    mclow.lists at gmail.com, rjmccall at apple.com
    Classification: Unclassified

Given t.cpp:

#include <set>
int main(int argc, const char *argv[]) {
  std::set<int> s;
  for (auto i = s.begin(); i != s.end(); ++i) {}
  return 0;
}

$ clang++ -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error -g
-std=c++11 -O1 t.cpp -o t; ./t
Illegal instruction: 4

or

$ clang++ -fsanitize=undefined-trap -g -std=c++11 -O1 t.cpp -o t; ./t
.../../include/c++/v1/__tree:826:16: runtime error: downcast of address
0x7fff51d9aa00 with insufficient space for an object of type
'std::__1::__tree_node<int, void *>'
0x7fff51d9aa00: note: pointer points here
 ff 7f 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00
00 00 00  00 00 00 00
              ^ 

This used clang+llvm TOT.

*** DESCRIPTION

This looks like undefined behavior in std::set.

// std::set<T>::end() looks like this.  __tree is a std::set<T>.
iterator end() {return __tree_.end();}

// std::__tree<T>::end() looks like this.
iterator end() {return iterator(__end_node());}

// std::__tree<T>::__end_node() looks like this, with some template stuff
stripped.
// __node_pointer is a typedef for std::__tree_node<T>*.
__node_pointer __end_node() {
  return static_cast<__node_pointer>(&__pair1_.first());
}

// Here are the fields of std::__tree<T>.  __compressed_pair is just
// a neat trick to avoid using storage for empty classes, which both
// __node_allocator and the value comparator are in this case.
// So, in terms of raw storage, this is basically a __node_pointer
// (i.e. __tree_node<T>*), an __end_node_t (i.e. __tree_end_node), and
// a size_type (i.e. size_t).  Note that, the
// __tree_end_node is 8 bytes into a 24-byte struct.
    __node_pointer                                          __begin_node_;
    __compressed_pair<__end_node_t, __node_allocator>  __pair1_;
    __compressed_pair<size_type, value_compare>        __pair3_;

// __tree_node<T> looks like this, again with some extra template stuff
removed:
template <class _Tp>
class __tree_node : public __tree_node_base {
public:
    typedef __tree_node_base<_VoidPtr> base;
    typedef _Tp value_type;

    value_type __value_;
};

// Here's __tree_node_base, as always with a ton of cruft removed:
class __tree_node_base : public __tree_end_node {
public:
    typedef __tree_node_base *pointer;
    typedef __tree_end_node<pointer> base;

    pointer __right_;
    pointer __parent_;
    bool __is_black_;
};

// Finally, here's __tree_end_node.
class __tree_end_node {
public:
    void *__left_;
};

So, std::set<T>::end() static_casts the address of a __tree_end_node down to a
__tree_node<T>*, which is undefined behavior because a __tree_end_node is not
actually a __tree_node<T>.

Therefore, the code generated for this uses @llvm.objectsize in a wrong way,
leading to runtime crash or a compile time generated trap (in case
-fsanitize=undefined-trap is used). This won't trigger in -O0 since in this
scenario @llvm.objectsize is always folded to "-1".

llvm.objectsize is not very smart, and I don't think it knows about the extents
of fields within an object.  So if there were 32 bytes of storage in a
std::set<T> starting from the offset of the __tree_end_node, UBSan would not be
able to detect this.  But there aren't; there's just 16 bytes, for the
__tree_end_node and the size_t.

This is all innocuous, of course.  Nothing is ever supposed to use any field in
the end node except "left"; the iterator just wants the more specific type to
remove a ton of annoying casts.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20150310/a07c3de0/attachment.html>


More information about the llvm-bugs mailing list