[PATCH] D77696: [Darwin] Teach `AtosSymbolizerProcess` to work on a copy of the environment.

Dan Liew via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 8 19:33:08 PDT 2020


delcypher marked 3 inline comments as done.
delcypher added inline comments.


================
Comment at: compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_mac.cpp:79
 
+  void CopyEnv(char **current_env) {
+    // Count number of elements (excluding nullptr entry).
----------------
kubamracek wrote:
> Can this be moved into sanitizer_mac.cpp or other non-symbolizer file? Make it an actual useful helper that takes the destination buffer as argument, its size, returns true/false.
No. The semantics of this function will change in the next patch that tie it to the needs of this class.  To avoid discussing this in two places let's discuss the change of semantics in https://reviews.llvm.org/D77706.


================
Comment at: compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_mac.cpp:101
+    CHECK_LE(entries_to_copy, kMaxNumEnvPEntries);
+    internal_memcpy(custom_env_, current_env, entries_to_copy * sizeof(uptr));
+    // Null terminate the array.
----------------
kubamracek wrote:
> Is it actually correct to not do a "deep copy" and only copy the pointers? I mean, who "owns" these strings? Are they guaranteed to live forever?
Darwin's Libc setenv implementation owns these strings. When users call `setenv(...)` it makes copies of them which it keeps track of. There are technically some circumstances where this is unsafe.

* setenv() is called on an existing environment variable in the environment array. In some cases Libc will call realloc() on the pointer to the existing string. In the right circumstances this could potentially leave a dangling pointer in our copy of the environment array.
* unsetenv() is called on an existing environment variable in the environment array. This will cause the string to be free'd. Again this could leave a dangling pointer in our copy of the environment array.

So no. They are not guaranteed to live for ever.

Possible solutions:

* Do nothing.
* Implement interceptors to prevent environment changes being made while symbolizer process is being launched. Technically this will never be sufficient because there are APIs to directly access the environment array directly.

Deep copying the strings on its own is actually not a good solution **because it suffers exactly from the same problem we have already**. If the environment gets changed while we're in the middle of copying the strings we could dereference free'd memory. 


================
Comment at: compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_mac.cpp:128
+  // Use statically allocated array to avoid allocation at symbolization time.
+  char *custom_env_[kMaxNumEnvPEntries];
 };
----------------
kubamracek wrote:
> Can you briefly describe how tricky would it be to not have a hardcoded limit? Could an InternalMmapVector be used here instead?
It's not tricky but I don't think it's a good idea. We should avoid trying to do allocation during symbolication because we could be in an out-of-memory situation. The only way to avoid doing allocations at symbolization time is to do it earlier. This patch allocates at the time the symbolizer class is created (because the statically declared array is an instance member) which is during early process init where we should have plenty of memory (if we don't then we have bigger problems).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77696/new/

https://reviews.llvm.org/D77696





More information about the llvm-commits mailing list