[libcxx-commits] [PATCH] D132769: [2b/3][ASan][libcxx] std::basic_string annotations
Vitaly Buka via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Jun 12 13:56:01 PDT 2023
vitalybuka added inline comments.
================
Comment at: libcxx/include/__string/extern_template_lists.h:30
// must never be removed from the stable list.
+#if !defined(_LIBCPP_HAS_NO_ASAN) && (_LIBCPP_CLANG_VER >= 1600)
+// TODO LLVM18: Remove the special-casing (_LIBCPP_CLANG_VER)
----------------
ldionne wrote:
> AdvenamTacet wrote:
> > AdvenamTacet wrote:
> > > ldionne wrote:
> > > > AdvenamTacet wrote:
> > > > > ldionne wrote:
> > > > > > philnik wrote:
> > > > > > > AdvenamTacet wrote:
> > > > > > > > philnik wrote:
> > > > > > > > > Why is this required?
> > > > > > > > We don't really want to use external templates here. Those we cannot control.
> > > > > > > >
> > > > > > > > Without that casing, non-instrumented functions may be used. This leads to false positives. With that change, during compilation, a version with instrumentation is created. I never had a problem of missing instrumentation after that change.
> > > > > > > >
> > > > > > > > Now when I look at it, `&& (_LIBCPP_CLANG_VER >= 1600)` should be removed, same problem may happen with older LLVM.
> > > > > > > IMO the right thing here would be to have an instrumented library, since the same problem exists for any other externally defined function, but I guess I'm OK with this for now. @ldionne are you OK with this?
> > > > > > Using ASAN is an ABI affecting change, kind of like our debug mode. Doing this will only seem to make things work, but in reality it'll cause other problems. For example, if you take a string created inside the dylib (without ASAN) and you try to use it from your ASAN-compiled user program, I assume you'll run into issues, right?
> > > > > >
> > > > > > The solution is (quite unfortunately) to have a different built library for that configuration, unless I missed something.
> > > > > In general, ASan does accept false positives in situation when part of the program is compiled without ASan. That may happen with `std::vector` annotations already. But minimizing the risk of it is a good course of action.
> > > > > I agree that a separate built is the goal solution. I think, it may work as a temporary solution and then be changed into a separate library built. What do you think?
> > > > >
> > > > > But I understand if you disagree, just I'm also not sure what has to be changed here to add an ASan library build to llvm. What has to be done? Where should I look to learn it?
> > > > > Please, let me know.
> > > > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
> > > >
> > > > CC @vitalybuka
> > > @vitalybuka do you have any suggestion how to implement it?
> > >
> > > Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
> > Hey @ldionne,
> > could you tell me where I should look to modify it? I don't really know how to start working on that change and I believe, it's the only missing part after landing D148693. I would appreciate letting me know at what files I should look first.
> >
> > Btw. it won't affect the issue mentioned above.
> I think we would want to start with a RFC on Discourse. This would be the first time that we introduce another shared library for libc++ and there are a few things to coordinate:
> 1. We need to build this new version of the library (do we do that with multiple CMake invocations or do we do like compiler-rt and allow building multiple libraries from a single invocation?)
> 2. We need to ship this library as part of the LLVM release and vendors need to also start shipping it with their own releases if they want `-fsanitize=address` to work
> 3. Do we need to also provide a `-fsanitize=address` version of other components in the LLVM stack? For example, if we ship another library that uses libc++ and we want it to work with `-fsanitize=address`, we would likely need to provide a sanitized version of it or else if e.g. a `std::string` is created inside that component and passed to another component that was compiled with `-fsanitize=address`, then you'd get a mismatch.
>
> So this is unfortunately kind of involved, but I think knowing how to do this would benefit us since we have similar needs for other efforts. For example, it would be conceivable to ship an unstable-ABI version of the library, and a hardened version as well, and doing that would follow the exact same steps.
> I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
For msan, tsan, instrumented libc++ was requirement all the time.
For asan, hwasan it's highly recommended.
For all use-case I work, we do that on build system level.
Our sanitizer*bootstrap build bots do that as well.
It would be nice if libc++ is shipped with each version. Sometimes additional flags may affect behavior of sanitizer, but even version instrumented with default flags will be useful.
>
> CC @vitalybuka
> @vitalybuka do you have any suggestion how to implement it?
I don't have specific thoughts on how to implemented.
I guess libc++ could install all sanitized versions with some prefixes: asan/msan/tsan/hwasan
Then the driver will pick one to link.
>
> Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
================
Comment at: libcxx/include/__string/extern_template_lists.h:30
// must never be removed from the stable list.
+#if !defined(_LIBCPP_HAS_NO_ASAN) && (_LIBCPP_CLANG_VER >= 1600)
+// TODO LLVM18: Remove the special-casing (_LIBCPP_CLANG_VER)
----------------
vitalybuka wrote:
> ldionne wrote:
> > AdvenamTacet wrote:
> > > AdvenamTacet wrote:
> > > > ldionne wrote:
> > > > > AdvenamTacet wrote:
> > > > > > ldionne wrote:
> > > > > > > philnik wrote:
> > > > > > > > AdvenamTacet wrote:
> > > > > > > > > philnik wrote:
> > > > > > > > > > Why is this required?
> > > > > > > > > We don't really want to use external templates here. Those we cannot control.
> > > > > > > > >
> > > > > > > > > Without that casing, non-instrumented functions may be used. This leads to false positives. With that change, during compilation, a version with instrumentation is created. I never had a problem of missing instrumentation after that change.
> > > > > > > > >
> > > > > > > > > Now when I look at it, `&& (_LIBCPP_CLANG_VER >= 1600)` should be removed, same problem may happen with older LLVM.
> > > > > > > > IMO the right thing here would be to have an instrumented library, since the same problem exists for any other externally defined function, but I guess I'm OK with this for now. @ldionne are you OK with this?
> > > > > > > Using ASAN is an ABI affecting change, kind of like our debug mode. Doing this will only seem to make things work, but in reality it'll cause other problems. For example, if you take a string created inside the dylib (without ASAN) and you try to use it from your ASAN-compiled user program, I assume you'll run into issues, right?
> > > > > > >
> > > > > > > The solution is (quite unfortunately) to have a different built library for that configuration, unless I missed something.
> > > > > > In general, ASan does accept false positives in situation when part of the program is compiled without ASan. That may happen with `std::vector` annotations already. But minimizing the risk of it is a good course of action.
> > > > > > I agree that a separate built is the goal solution. I think, it may work as a temporary solution and then be changed into a separate library built. What do you think?
> > > > > >
> > > > > > But I understand if you disagree, just I'm also not sure what has to be changed here to add an ASan library build to llvm. What has to be done? Where should I look to learn it?
> > > > > > Please, let me know.
> > > > > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
> > > > >
> > > > > CC @vitalybuka
> > > > @vitalybuka do you have any suggestion how to implement it?
> > > >
> > > > Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
> > > Hey @ldionne,
> > > could you tell me where I should look to modify it? I don't really know how to start working on that change and I believe, it's the only missing part after landing D148693. I would appreciate letting me know at what files I should look first.
> > >
> > > Btw. it won't affect the issue mentioned above.
> > I think we would want to start with a RFC on Discourse. This would be the first time that we introduce another shared library for libc++ and there are a few things to coordinate:
> > 1. We need to build this new version of the library (do we do that with multiple CMake invocations or do we do like compiler-rt and allow building multiple libraries from a single invocation?)
> > 2. We need to ship this library as part of the LLVM release and vendors need to also start shipping it with their own releases if they want `-fsanitize=address` to work
> > 3. Do we need to also provide a `-fsanitize=address` version of other components in the LLVM stack? For example, if we ship another library that uses libc++ and we want it to work with `-fsanitize=address`, we would likely need to provide a sanitized version of it or else if e.g. a `std::string` is created inside that component and passed to another component that was compiled with `-fsanitize=address`, then you'd get a mismatch.
> >
> > So this is unfortunately kind of involved, but I think knowing how to do this would benefit us since we have similar needs for other efforts. For example, it would be conceivable to ship an unstable-ABI version of the library, and a hardened version as well, and doing that would follow the exact same steps.
> > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
>
> For msan, tsan, instrumented libc++ was requirement all the time.
> For asan, hwasan it's highly recommended.
>
> For all use-case I work, we do that on build system level.
> Our sanitizer*bootstrap build bots do that as well.
>
> It would be nice if libc++ is shipped with each version. Sometimes additional flags may affect behavior of sanitizer, but even version instrumented with default flags will be useful.
>
>
> >
> > CC @vitalybuka
>
> > @vitalybuka do you have any suggestion how to implement it?
>
> I don't have specific thoughts on how to implemented.
> I guess libc++ could install all sanitized versions with some prefixes: asan/msan/tsan/hwasan
> Then the driver will pick one to link.
>
>
> >
> > Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
>
>
> I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
>
> CC @vitalybuka
================
Comment at: libcxx/include/__string/extern_template_lists.h:30
// must never be removed from the stable list.
+#if !defined(_LIBCPP_HAS_NO_ASAN) && (_LIBCPP_CLANG_VER >= 1600)
+// TODO LLVM18: Remove the special-casing (_LIBCPP_CLANG_VER)
----------------
vitalybuka wrote:
> vitalybuka wrote:
> > ldionne wrote:
> > > AdvenamTacet wrote:
> > > > AdvenamTacet wrote:
> > > > > ldionne wrote:
> > > > > > AdvenamTacet wrote:
> > > > > > > ldionne wrote:
> > > > > > > > philnik wrote:
> > > > > > > > > AdvenamTacet wrote:
> > > > > > > > > > philnik wrote:
> > > > > > > > > > > Why is this required?
> > > > > > > > > > We don't really want to use external templates here. Those we cannot control.
> > > > > > > > > >
> > > > > > > > > > Without that casing, non-instrumented functions may be used. This leads to false positives. With that change, during compilation, a version with instrumentation is created. I never had a problem of missing instrumentation after that change.
> > > > > > > > > >
> > > > > > > > > > Now when I look at it, `&& (_LIBCPP_CLANG_VER >= 1600)` should be removed, same problem may happen with older LLVM.
> > > > > > > > > IMO the right thing here would be to have an instrumented library, since the same problem exists for any other externally defined function, but I guess I'm OK with this for now. @ldionne are you OK with this?
> > > > > > > > Using ASAN is an ABI affecting change, kind of like our debug mode. Doing this will only seem to make things work, but in reality it'll cause other problems. For example, if you take a string created inside the dylib (without ASAN) and you try to use it from your ASAN-compiled user program, I assume you'll run into issues, right?
> > > > > > > >
> > > > > > > > The solution is (quite unfortunately) to have a different built library for that configuration, unless I missed something.
> > > > > > > In general, ASan does accept false positives in situation when part of the program is compiled without ASan. That may happen with `std::vector` annotations already. But minimizing the risk of it is a good course of action.
> > > > > > > I agree that a separate built is the goal solution. I think, it may work as a temporary solution and then be changed into a separate library built. What do you think?
> > > > > > >
> > > > > > > But I understand if you disagree, just I'm also not sure what has to be changed here to add an ASan library build to llvm. What has to be done? Where should I look to learn it?
> > > > > > > Please, let me know.
> > > > > > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
> > > > > >
> > > > > > CC @vitalybuka
> > > > > @vitalybuka do you have any suggestion how to implement it?
> > > > >
> > > > > Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
> > > > Hey @ldionne,
> > > > could you tell me where I should look to modify it? I don't really know how to start working on that change and I believe, it's the only missing part after landing D148693. I would appreciate letting me know at what files I should look first.
> > > >
> > > > Btw. it won't affect the issue mentioned above.
> > > I think we would want to start with a RFC on Discourse. This would be the first time that we introduce another shared library for libc++ and there are a few things to coordinate:
> > > 1. We need to build this new version of the library (do we do that with multiple CMake invocations or do we do like compiler-rt and allow building multiple libraries from a single invocation?)
> > > 2. We need to ship this library as part of the LLVM release and vendors need to also start shipping it with their own releases if they want `-fsanitize=address` to work
> > > 3. Do we need to also provide a `-fsanitize=address` version of other components in the LLVM stack? For example, if we ship another library that uses libc++ and we want it to work with `-fsanitize=address`, we would likely need to provide a sanitized version of it or else if e.g. a `std::string` is created inside that component and passed to another component that was compiled with `-fsanitize=address`, then you'd get a mismatch.
> > >
> > > So this is unfortunately kind of involved, but I think knowing how to do this would benefit us since we have similar needs for other efforts. For example, it would be conceivable to ship an unstable-ABI version of the library, and a hardened version as well, and doing that would follow the exact same steps.
> > > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
> >
> > For msan, tsan, instrumented libc++ was requirement all the time.
> > For asan, hwasan it's highly recommended.
> >
> > For all use-case I work, we do that on build system level.
> > Our sanitizer*bootstrap build bots do that as well.
> >
> > It would be nice if libc++ is shipped with each version. Sometimes additional flags may affect behavior of sanitizer, but even version instrumented with default flags will be useful.
> >
> >
> > >
> > > CC @vitalybuka
> >
> > > @vitalybuka do you have any suggestion how to implement it?
> >
> > I don't have specific thoughts on how to implemented.
> > I guess libc++ could install all sanitized versions with some prefixes: asan/msan/tsan/hwasan
> > Then the driver will pick one to link.
> >
> >
> > >
> > > Btw. a separate dylib may help with issue described here: https://github.com/llvm/llvm-project/issues/62431 (not sure, tho).
> >
> >
> > I believe this means that `-fsanitize=address` would have to cause a different `libc++.dylib` to be linked. So for example, the driver would add e.g. `-l c++asan` instead of `-l c++`, and we would have to produce `libc++asan.dylib` in addition to `libc++.dylib`. That's not a small change though, since it seems like this might be the first time a sanitizer has an effect on the version of libc++ we have to link against. But it might be the right way to go (probably the only way to go, really). In practice, disabling extern templates like this is only going to give users the impression that things work and then they will run into issues.
> >
> > CC @vitalybuka
>
>
> I think we would want to start with a RFC on Discourse. This would be the first time that we introduce another shared library for libc++ and there are a few things to coordinate:
> 1. We need to build this new version of the library (do we do that with multiple CMake invocations or do we do like compiler-rt and allow building multiple libraries from a single invocation?)
Single invocation would be nice, as it's better scale on other sanitizers.
> 2. We need to ship this library as part of the LLVM release and vendors need to also start shipping it with their own releases if they want `-fsanitize=address` to work
please include msan/hwasan/tsan versions into the plan
lsan/ubsan do not needed special libc++.
dfsan is close to msan, but it's very specific tool, so we can skip for now.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132769/new/
https://reviews.llvm.org/D132769
More information about the libcxx-commits
mailing list