[all-commits] [llvm/llvm-project] a83f8e: [libc++][hardening] Check bounds on arithmetic in ...

David Benjamin via All-commits all-commits at lists.llvm.org
Mon Mar 11 20:41:09 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a83f8e0314fcdda162e54cbba1c9dcf230dff093
      https://github.com/llvm/llvm-project/commit/a83f8e0314fcdda162e54cbba1c9dcf230dff093
  Author: David Benjamin <davidben at google.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M libcxx/include/__iterator/bounded_iter.h
    A libcxx/test/libcxx/containers/views/views.span/assert.iterator-indexing.pass.cpp
    R libcxx/test/libcxx/containers/views/views.span/debug.iterator-indexing.pass.cpp
    M libcxx/test/libcxx/iterators/bounded_iter/dereference.pass.cpp
    A libcxx/test/libcxx/strings/string.view/string.view.iterators/assert.iterator-indexing.pass.cpp
    R libcxx/test/libcxx/strings/string.view/string.view.iterators/debug.iterator-indexing.pass.cpp

  Log Message:
  -----------
  [libc++][hardening] Check bounds on arithmetic in __bounded_iter (#78876)

Previously, `__bounded_iter` only checked `operator*`. It allowed the
pointer to go out of bounds with `operator++`, etc., and relied on
`operator*` (which checked `begin <= current < end`) to handle
everything. This has several unfortunate consequences:

First, pointer arithmetic is UB if it goes out of bounds. So by the time
`operator*` checks, it may be too late and the optimizer may have done
something bad. Checking both operations is safer.

Second, `std::copy` and friends currently bypass bounded iterator
checks. I think the only hope we have to fix this is to key on `iter +
n` doing a check. See #78771 for further discussion. Note this PR is not
sufficient to fix this. It adds the output bounds check, but ends up
doing it after the `memmove`, which is too late.

Finally, doing these checks is actually *more* optimizable. See #78829,
which is fixed by this PR. Keeping the iterator always in bounds means
`operator*` can rely on some invariants and only needs to check `current
!= end`. This aligns better with common iterator patterns, which use
`!=` instead of `<`, so it's easier to delete checks with local
reasoning.

See https://godbolt.org/z/vEWrWEf8h for how this new `__bounded_iter`
impacts compiler output. The old `__bounded_iter` injected checks inside
the loops for all the `sum()` functions, which not only added a check
inside a loop, but also impeded Clang's vectorization. The new
`__bounded_iter` allows all the checks to be optimized out and we emit
the same code as if it wasn't here.

Not everything is ideal however. `add_and_deref` ends up emitting two
comparisons now instead of one. This is because a missed optimization in
Clang. I've filed #78875 for that. I suspect (with no data) that this PR
is still a net performance win because impeding ranged-for loops is
particularly egregious. But ideally we'd fix the optimizer and make
`add_and_deref` fine too.

There's also something funny going on with `std::ranges::find` which I
have not yet figured out yet, but I suspect there are some further
missed optimization opportunities.

Fixes #78829.

(CC @danakj)



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list