[libcxx-commits] [libcxx] [ASan][libc++] Turn on ASan annotations for short strings (PR #79536)

Mon Jul 22 05:26:26 PDT 2024

AdvenamTacet wrote:

Hey,
first of all, thank you very much @nico for the feedback, it's really helpful to know experience of other users and I really appreciate that!
I'm back home and I'm starting to look what I can commit to improve experience. Bellow are my observations.

---

First of all, what you already pointed out, (3) is [a known issue since creation of ASan vector annotations](https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow#false-positives). Container annotations (for all containers) don't work when only part of the code is compiled with ASan and therefore it's possible to turn off container annotations in runtime with `ASAN_OPTIONS=detect_container_overflow=0)` environment variable.
However, I see why it may happen more often with strings than with other containers and it looks like it may be even more more visible with short strings based on your case.

We removed flag allowing to turn off short string annotations with reasoning that we don't want to force users to know that under the hood we have something like SSO. Adding it back is a five seconds work. We can add it and never mention it anywhere, therefore only people who read the file (and therefore are aware of SSO) know about it.

Alternatively, (and I consider it much better solution, at least at the moment of writing it) is adding  something like `ASAN_OPTIONS=detect_container_overflow=0` for short strings, maybe also strings in general. That would allow turning it off without recompiling everything and still work when we start to ship annotated libc++. I'm inclined to implement it.

If we have a consensus about implementing one or another, I'm happy to sit to it.

Complex solution to (3) would require big changes or additional code in non-ASan builds. I don't think anyone is willing to accept that and I don't support it.

---

Regarding the (2), I agree with @ldionne and have nothing to add. (4) is very similar in nature.

---

I think you have not-optimal workaround for (4). I will address that in https://github.com/llvm/llvm-project/pull/91702 but for the time being I see two better solutions.

- If after `memcpy` purpose of the source memory area is changed, just unpoison whole source area before the call to memcpy. You can do it with [`ASAN_UNPOISON_MEMORY_REGION` macro](https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning).

- If after memcpy string objects in source memory area are still in use, just copy the memory with turned off instrumentation (you can turn off instrumentation with `__attribute__((__no_sanitize__("address")))`). At the moment it's probably best to create own `asan_unsafe_memcpy` with turned off instrumentation.

I hope to have a better solution soon, but  https://github.com/llvm/llvm-project/pull/91702 turned out to be extremely time consuming and I was forced to postpone working on it for a while, but I hope to commit that whole week to open source and to finish it soon.

At the same time, I think ASan should detect that, we should only make it easier for developers to tell "I understand that this memory operation is not safe".

---

I'm really happy about (1), if you have more data about recent ASan container annotations changes (string annotations/deque annotations/annotations of non-default allocators) and can share them, it would be extremely valuable for me and I would very appreciate that. I'm interested in any and every piece of data you can share.

--- 

In general, I'm very strongly in favor of enabling all annotations by default. But we can make disabling some annotations easier.
It's much easier to tell "Hey, you can turn off those annotations" when user hits error, than inform users that there is a possibility to enable more annotations.

All ASan errors mentioned here should be reported by ASan, imho. As mention before, I think we should only make it easier for developers to tell "I understand that this memory operation is not safe". And I plan to address it. I don't want to commit to finalize it this week, but I will try to do it asap.

I think, now the best thing we can do is adding runtime ASan options to turn off string annotations and (maybe) short string annotations.
I understand the argument "users shouldn't have to know about SSO", additionally same problems may happen with all containers including long string annotations as described in my comment to (3)... but if in practice only short string annotations are problematic, I think we should address it somehow.
I support creating a runtime option to turn off only short string annotations and string annotations in general, suggesting the latter for all users [who don't know about SSO]. I believe there is big enough group of people who understand SSO and use ASan that this additional option would be helpful. But I understand if we decide against that granularity.
If not, the option to turn off string annotations in general (and in runtime) seems like a good thing to me.

Thank you for your feedback again!

https://github.com/llvm/llvm-project/pull/79536