[libc-commits] [PATCH] D77949: [libc] Add SIMD strlen implementation.
Mitch Phillips via Phabricator via libc-commits
libc-commits at lists.llvm.org
Tue Apr 14 13:31:25 PDT 2020
hctim added a comment.
I guess this is one of those times where we would have to give up correctness for speed. Quick question - do we have performance numbers that indicate this is faster? I did some quick runs on quick-bench and found it had improvements, but was surprised that the compiler didn't produce a `repnz` for the old strlen variant...
Bearing in mind the LLVM libc goal of being sanitizer-friendly, you'd have to implement the sanitizer behaviour yourself. It would basically be:
1. For each bounds-based sanitizer (`-fsanitize=address`, `-fsanitize=bounds`, `-fsanitize=memory`, `-fsanitize=hwaddress`, `-fsanitize=memtag`):
  - Mark the function as unsanitizeable: `__attribute__((no_sanitize("address")))`
  - Guard the following code behind an `#if __has_feature(address_sanitizer)` (note that the interfaces will slightly differ depending on the sanitizer):
    1. Check the shadow before reading each word (`__asan_region_is_poisoned(ptrPtr, sizeof(uintptr_t))`).
    2. If the string still looks like it should continue, but the sanitizer reported that it's OOB, you should trap (`__asan_report_error`).
    3. If the string terminates, and the sanitizer reported that it's OOB, you should check that the string termination point is strictly less than the sanitizer OOB point (fall back to a single-byte slow path using `__asan_address_is_poisoned`).
    4. If the string terminates, and sanitizers don't report OOB, continue as usual.
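For illustration, the ASan case of the steps above could be sketched roughly as follows. This is not the code from D77949: `sketch_strlen`, `word_is_poisoned`, `check_byte` and the `SKETCH_*` macros are hypothetical names, and a plain word-at-a-time loop stands in for the SIMD loop. Instead of distinguishing cases 2 and 3 inside the fast loop, the sketch drops to the single-byte path as soon as a word is poisoned or contains a zero byte, which gives the same outcome for all four cases. Without ASan the checks compile away, and the word loads may read a few bytes past the terminator inside an aligned word, which is the usual correctness-for-speed trade-off of this technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Detect ASan under Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__).
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_ASAN 1
#  endif
#endif
#if !defined(SKETCH_HAS_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_ASAN 1
#endif

#ifdef SKETCH_HAS_ASAN
#  include <sanitizer/asan_interface.h>
#  define SKETCH_NO_ASAN __attribute__((no_sanitize("address")))
#else
#  define SKETCH_NO_ASAN
#endif

using word_t = std::uintptr_t;

#ifdef SKETCH_HAS_ASAN
// Step 1: consult the shadow before touching a whole word.
static inline bool word_is_poisoned(const char *p) {
  return __asan_region_is_poisoned(const_cast<char *>(p), sizeof(word_t)) != nullptr;
}
// Steps 2 and 3: report as soon as a byte we need is poisoned.
// Argument order follows the declaration in <sanitizer/asan_interface.h>;
// worth double-checking against the header of the toolchain in use.
static inline void check_byte(const char *p) {
  if (__asan_address_is_poisoned(p)) {
    char sp_marker;  // approximate stack pointer for the report
    __asan_report_error(__builtin_return_address(0), __builtin_frame_address(0),
                        &sp_marker, const_cast<char *>(p),
                        /*is_write=*/0, /*access_size=*/1);
  }
}
#else
static inline bool word_is_poisoned(const char *) { return false; }
static inline void check_byte(const char *) {}
#endif

// True if any byte of w is zero (classic bit trick).
static inline bool has_zero_byte(word_t w) {
  constexpr word_t kOnes = ~static_cast<word_t>(0) / 0xFF;  // 0x0101...01
  constexpr word_t kHighs = kOnes << 7;                     // 0x8080...80
  return ((w - kOnes) & ~w & kHighs) != 0;
}

SKETCH_NO_ASAN
std::size_t sketch_strlen(const char *s) {
  assert(s != nullptr);
  const char *p = s;
  // Head: go byte by byte until p is word-aligned.
  while (reinterpret_cast<std::uintptr_t>(p) % sizeof(word_t) != 0) {
    check_byte(p);
    if (*p == '\0') return static_cast<std::size_t>(p - s);
    ++p;
  }
  // Fast loop: only read words the sanitizer considers fully addressable.
  for (;;) {
    if (word_is_poisoned(p)) break;  // cases 2 and 3: let the slow path decide
    word_t w;
    std::memcpy(&w, p, sizeof(w));
    if (has_zero_byte(w)) break;     // case 4: terminator is inside this word
    p += sizeof(word_t);
  }
  // Slow path: every byte is checked before it is read, so a terminator that
  // comes before the first poisoned byte is accepted and anything else reports.
  for (;; ++p) {
    check_byte(p);
    if (*p == '\0') return static_cast<std::size_t>(p - s);
  }
}
```

Other sanitizers would need their own equivalents of the two helper functions, since (as noted above) the interfaces differ per sanitizer.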
Alternatively you could range-check based on the return value, which would probably be simpler (this is what we do with the interceptors currently, see `sanitizer_common_interceptors.inc:364`).
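A minimal sketch of the range-check alternative, assuming ASan only: run the uninstrumented routine first, then validate the whole range `[s, s + len + 1)` in one call. The names `unchecked_strlen`, `checked_strlen` and the `RC_*` macros are hypothetical, and the byte loop is only a stand-in for the real SIMD implementation; this is not a copy of what the interceptors do internally.

```cpp
#include <cassert>
#include <cstddef>

// Detect ASan under Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__).
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define RC_HAS_ASAN 1
#  endif
#endif
#if !defined(RC_HAS_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define RC_HAS_ASAN 1
#endif

#ifdef RC_HAS_ASAN
#  include <sanitizer/asan_interface.h>
#  define RC_NO_ASAN __attribute__((no_sanitize("address")))
#else
#  define RC_NO_ASAN
#endif

// Stand-in for the fast, uninstrumented strlen (hypothetical; the real one
// would be the SIMD loop from the patch).
RC_NO_ASAN
static std::size_t unchecked_strlen(const char *s) {
  const char *p = s;
  while (*p != '\0') ++p;
  return static_cast<std::size_t>(p - s);
}

// Compute the length first, then check that every byte up to and including
// the terminator is addressable according to the sanitizer.
RC_NO_ASAN
std::size_t checked_strlen(const char *s) {
  assert(s != nullptr);
  const std::size_t len = unchecked_strlen(s);
#ifdef RC_HAS_ASAN
  // Returns the first poisoned address in the range, or null if none.
  if (void *bad = __asan_region_is_poisoned(const_cast<char *>(s), len + 1)) {
    char sp_marker;  // approximate stack pointer for the report
    // Argument order follows <sanitizer/asan_interface.h>; verify locally.
    __asan_report_error(__builtin_return_address(0), __builtin_frame_address(0),
                        &sp_marker, bad, /*is_write=*/0, /*access_size=*/1);
  }
#endif
  return len;
}
```

The trade-off is that the report arrives only after the fast routine has already read past the bad byte, but the fast loop itself stays free of sanitizer-specific branches.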
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D77949/new/
https://reviews.llvm.org/D77949