[libc-commits] [PATCH] D78612: [libc] Add sanitizer instrumentation to the SIMD strlen implementation.

Siva Chandra via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Apr 22 09:12:52 PDT 2020


sivachandra added a subscriber: phosek.
sivachandra added a comment.

In D78612#1996257 <https://reviews.llvm.org/D78612#1996257>, @abrachet wrote:

> I'm not sure if this discussion belongs here or D77949 <https://reviews.llvm.org/D77949>, but as this complicates that patch even more I will do so here.
>
> I really wonder if the added complexity is worth the speedup we are hoping to achieve. Firstly even if this implements ASan which is by far the most common and ubiquitously useful, it makes sense to implement it here, we also have fuzz tests set up which rely on `-fsanitize=fuzzer`. That won't be implemented here. The whole appeal of sanitizers is that nothing needs to change in our code, but it's not true here no matter what we do if we rely on UB.
>
> This patch is well done and it's very clever and easy to understand way to implement by hand what `-fsanitize=address` is doing. But it is undeniably making this much less maintainable. Is the cost we are paying in maintainability worth the speedup we are hoping to get using this implementation? If this patch is only for `strlen` and not an example for how we will handle all such cases which rely on UB for clever tricks, then I would say it is not worth it.


Yes, the idea is to use `strlen` as the first example. The approach we want to take with LLVM libc is to hand craft the instrumentation vs using interceptors as is done for other libcs. So, we want to start learning about such hand-crafted instrumentation using this "simple" example.

About other sanitizers, that is a very valid question. I want @hctim or other sanitizer people to tell us more. FWIW, I heard of informal interest from glibc developers also about using sanitizer instrumentation. Currently, each sanitizer has its own interface API. I would really prefer that all of them are unified in some manner to make them simpler to use. So, my personal view is that we might be heading towards a more generalized/standardized sanitizer API.

> FWIW all of that would be moot if this was performance critical, but I'm not convinced that `strlen` is a particularly hot function, especially given its use outside of C is very limited. Even in C++ it would be used much less frequently and for programs written in other languages, that would be even less than that is my guess. Even if it were a hot function, how large does the string need to be before the implementation in D77949 <https://reviews.llvm.org/D77949> is better than what we currently have?

I do not have any benchmarks to share or point to wrt `strlen` particularly. I have done some asking around, and the feedback I received is that we should match at least glibc's performance.

Also adding @phosek who can share his view/opinion about this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78612/new/

https://reviews.llvm.org/D78612





More information about the libc-commits mailing list