[PATCH] D63469: hwasan: Teach the runtime to identify the local variable being accessed in UAR reports.

Peter Collingbourne via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 24 16:45:50 PDT 2019


pcc added inline comments.


================
Comment at: compiler-rt/lib/hwasan/hwasan_report.cpp:281
+
+      if (found_local)
+        return;
----------------
eugenis wrote:
> I think it might be useful to iterate until the end of the buffer to find more candidates; if anything, that would avoid losing the information that's already there.
> 
> This is good enough for now. Some time in the future I think we should refactor this to collect all the candidates (including ex. potential buffer overflows with larger offsets, across multiple small allocation) and sort them by likelihood.
> 
By the time we get here, we've already visited every frame. We don't stop when we find the first match or anything like that.

Or are you saying that we should print the `Previously allocated frames` section in all cases (e.g. for manual identification of OOB accesses)? 


================
Comment at: compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_internal.h:18
 #include "sanitizer_file.h"
+#include "sanitizer_vector.h"
 
----------------
eugenis wrote:
> is this include necessary here?
No, removed.


================
Comment at: compiler-rt/test/hwasan/TestCases/stack-uar.c:36
   // NOSYM: Previously allocated frames:
-  // NOSYM-NEXT: sp: 0x{{.*}} #0 0x{{.*}} ({{.*}}/stack-uar.c.tmp+0x{{.*}}){{$}}
-  // NOSYM-NEXT: 16 CCC;
+  // NOSYM-NEXT: record_addr:0x{{.*}} record:0x{{.*}} ({{.*}}/stack-uar.c.tmp+0x{{.*}}){{$}}
+  // NOSYM-NEXT: record_addr:0x{{.*}} record:0x{{.*}} ({{.*}}/stack-uar.c.tmp+0x{{.*}}){{$}}
----------------
eugenis wrote:
> The advantage of printing decoded records, as before, is that they can be parsed with all the old symbolization scripts. Explicitly listing the last digits of FP is also sometimes useful to match against register values in the report.
> 
> Could we change the format to
> 
> [tag], fp:0x12345 #0 PC (module+offset)  
> 
> #0 is also for compatibility with symbolization scripts.
Do you think it's important to be compatible with old symbolization scripts here? I think we will probably need a new symbolization script anyway that does all the same things that the runtime is doing to identify stack objects. That's also why I decided to print raw values here, so that we'd be less likely to need to change the line format in the future if the entry format changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63469/new/

https://reviews.llvm.org/D63469





More information about the llvm-commits mailing list