[PATCH] D133268: [tsan] Replace mem intrinsics with calls to interceptors

Marco Elver via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 5 10:29:42 PDT 2022


melver added inline comments.


================
Comment at: llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp:344
   MemmoveFn =
-      M.getOrInsertFunction("memmove", Attr, IRB.getInt8PtrTy(),
+      M.getOrInsertFunction("__interceptor_memmove", Attr, IRB.getInt8PtrTy(),
                             IRB.getInt8PtrTy(), IRB.getInt8PtrTy(), IntptrTy);
----------------
vitalybuka wrote:
> melver wrote:
> > This will cause problems for non-TSan runtimes, like KCSAN (on Linux, but also Fuchsia, FreeBSD, etc.).
> > 
> > I also don't understand why this happens in the first place, isn't ThreadSanitizer pass supposed to run after such transforms?
> > 
> > Some options in order of preference:
> > 
> > 1. Is there another way to suppress this "optimization" back to an intrinsic? InstCombine has some exceptions when not to touch a call: https://github.com/llvm/llvm-project/blob/c4a174be9190b20eff0334415b5db7b2e2e204aa/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L2808
> > So the function calls could be 'notail' AFAIK (I think llvm/test/Transforms/InstCombine/memcpy-1.ll is missing a test for that, it's only testing 'musttail'). Or a new attribute that's added to tryOptimizeCall().
> > 
> > 2. Make the prefix configurable with an option, so we can just set it to "" or `"__tsan_"` in the kernel (because `__interceptor_` seems too generic). It might be nice to mirror what ASan and MSan does, and actually call them `__tsan_mem*`, but it requires a runtime change as well.
> > This will cause problems for non-TSan runtimes, like KCSAN (on Linux, but also Fuchsia, FreeBSD, etc.).
> > 
> > I also don't understand why this happens in the first place, isn't ThreadSanitizer pass supposed to run after such transforms?
> 
> This is not happening yet. 
> 
> I want to move sanitizers to early stages. Now we have simplification -> optimization -> sanitizers. I believe it should be simplification -> sanitizers -> optimization.
> 
> There is no reasons, other than bugs like this, why we don't apply optimization on instrumented code. It will improve size by 10% (probably also performance, I didn't measure perf yet). I mostly focused on Msan/Asan but other will benefit as well.
> 
> Btw. with ThinLTO (maybe full LTO as well) optimization passes already happen after sanitizers anyway. We don't test/use it much, but it should miss races on mem instructions.
> 
> > 
> > Some options in order of preference:
> > 
> > 1. Is there another way to suppress this "optimization" back to an intrinsic? InstCombine has some exceptions when not to touch a call: https://github.com/llvm/llvm-project/blob/c4a174be9190b20eff0334415b5db7b2e2e204aa/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L2808
> > So the function calls could be 'notail' AFAIK (I think llvm/test/Transforms/InstCombine/memcpy-1.ll is missing a test for that, it's only testing 'musttail'). Or a new attribute that's added to tryOptimizeCall().
> 
> I don't like this approach as it will leak a lot of sanitizers into other passes.
> Fixing passes for santizers should be the last resort, we should rather try to generate code which will not be broken by valid transformation. Here prefixes is an easy solution.
> 
> > 
> > 2. Make the prefix configurable with an option, so we can just set it to "" or `"__tsan_"` in the kernel (because `__interceptor_` seems too generic). It might be nice to mirror what ASan and MSan does, and actually call them `__tsan_mem*`, but it requires a runtime change as well.
> 
> I can do that, but it will be an issue for kernel, as we will optimize them out.
> I think prefix (everywhere) and proper callback in the way to go, like we don in Msan.
> 
> I don't insist on __interceptor_, just thought that maybe we can use existing API.
> 
> Also I noticed ClKasanMemIntrinCallbackPrefix in asan, I propose to make it default, and maybe the only behavior. 
> 
> CC @glider 
> Also I noticed ClKasanMemIntrinCallbackPrefix in asan, I propose to make it default, and maybe the only behavior.

I added that because some kernel folks were complaining about compiler generating mem*() calls in uninstrumented functions, and mem*() being instrumented.

But KASAN isn't ready yet, so we can't make it the default. The kernel will flip the flag when it's been implemented, otherwise we just break older kernel unnecessarily. Also for KASAN, we need to support gcc, which does not emit `__asan_mem*`, and likely won't any time soon. So with the command line option we can dynamically detect if the compiler supports the prefix or not and then produce slightly different ways of instrumenting mem*() functions.

For the ThreadSanitizer pass change, since we don't distinguish between kernel and user space, there's not much we can do then other than break older kernels. I will ask stable kernel maintainers to backport the eventual fix.

I support the consistent prefix, i.e. `__asan_`, `__msan_`, and `__tsan_`.

Since it requires a kernel fix for KCSAN anyway (cmdline option or not), I think we can leave out the configurable option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133268/new/

https://reviews.llvm.org/D133268



More information about the llvm-commits mailing list