[PATCH] D59725: Additions to creduce script
Alexander Richardson via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 26 02:26:34 PDT 2019
arichardson added inline comments.
================
Comment at: clang/utils/creduce-clang-crash.py:145
+ matches = re.findall(stacktrace_re, crash_output)
+ result = filter(lambda x: x and x.strip() not in filters, matches)[:5]
+ for msg in result:
----------------
george.burgess.iv wrote:
> nit: please prefer `[x for x in matches if x and x.strip() not in filters][:5]`. py3's filter returns a generator, whereas py2's returns a list.
Stack traces also look different on macOS and it would be nice to handle that too.
Here's a sample I got from adding a llvm_unreachable at a random location:
```
My unreachable message...
UNREACHABLE executed at /Users/alex/cheri/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp:4468!
Stack dump:
0. Program arguments: /Users/alex/cheri/llvm-project/cmake-build-debug/bin/opt -mtriple=cheri-unknown-freebsd -mcpu=cheri128 -mattr=+cheri128 -target-abi purecap -relocation-model pic -S -instcombine -simplifycfg /Users/alex/cheri/llvm-project/llvm/test/CodeGen/Mips/cheri/simplifycfg-ptrtoint.ll -o -
1. Running pass 'Function Pass Manager' on module '/Users/alex/cheri/llvm-project/llvm/test/CodeGen/Mips/cheri/simplifycfg-ptrtoint.ll'.
2. Running pass 'Combine redundant instructions' on function '@cannot_fold_tag_unknown'
0 libLLVMSupport.dylib 0x0000000114515a9d llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 45
1 libLLVMSupport.dylib 0x00000001145153f1 llvm::sys::RunSignalHandlers() + 65
2 libLLVMSupport.dylib 0x0000000114515fbf SignalHandler(int) + 111
3 libsystem_platform.dylib 0x00007fff5b637b3d _sigtramp + 29
4 libsystem_platform.dylib 0x00007ffee20d0cf0 _sigtramp + 2259259856
5 libsystem_c.dylib 0x00007fff5b4f51c9 abort + 127
6 libLLVMSupport.dylib 0x000000011446bb12 llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) + 162
7 libLLVMInstCombine.dylib 0x0000000112c345c8 llvm::InstCombiner::foldICmpUsingKnownBits(llvm::ICmpInst&) + 4136
8 libLLVMInstCombine.dylib 0x0000000112c34d19 llvm::InstCombiner::visitICmpInst(llvm::ICmpInst&) + 569
9 libLLVMInstCombine.dylib 0x0000000112bb9cf2 llvm::InstCombiner::run() + 1522
10 libLLVMInstCombine.dylib 0x0000000112bbb310 combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, bool, llvm::LoopInfo*) + 624
11 libLLVMInstCombine.dylib 0x0000000112bbb6d6 llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) + 214
12 libLLVMCore.dylib 0x0000000111c0bb4d llvm::FPPassManager::runOnFunction(llvm::Function&) + 317
13 libLLVMCore.dylib 0x0000000111c0be83 llvm::FPPassManager::runOnModule(llvm::Module&) + 99
14 libLLVMCore.dylib 0x0000000111c0c2c4 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) + 420
15 libLLVMCore.dylib 0x0000000111c0c036 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 182
16 opt 0x000000010db6657b main + 7163
17 libdyld.dylib 0x00007fff5b44ced9 start + 1
```
================
Comment at: clang/utils/creduce-clang-crash.py:362
+ r = Reduce(crash_script, file_to_reduce)
+ r.simplify_clang_args()
+ r.write_interestingness_test()
----------------
george.burgess.iv wrote:
> akhuang wrote:
> > george.burgess.iv wrote:
> > > I'm unclear on why we do a partial simplification of clang args here, then a full reduction after creduce. Is there a disadvantage to instead doing:
> > >
> > > r.write_interestingness_test()
> > > r.clang_preprocess()
> > > r.reduce_clang_args()
> > > r.run_creduce()
> > > r.reduce_clang_args()
> > >
> > > ?
> > >
> > > The final `reduce_clang_args` being there to remove extra `-D`/`-I`/etc. args if preprocessing failed somehow, to remove `-std=foo` args if those aren't relevant after reduction, etc.
> > >
> > > The advantage to this being that we no longer need to maintain a `simplify` function, and creduce runs with a relatively minimal set of args to start with.
> > >
> > > In any case, can we please add comments explaining why we chose whatever order we end up going with, especially where subtleties make it counter to what someone might naively expect?
> > Basically the disadvantage is that clang takes longer to run before the reduction, so it takes unnecessary time to iterate through all the arguments beforehand.
> > And yeah, the final `reduce_clang_args` is there to minimize the clang arguments needed to reproduce the crash on the reduced source file.
> >
> > If that makes sense, I can add comments for this-
> Eh, I don't have a strong opinion here. My inclination is to prefer a simpler reduction script if the difference is len(args) clang invocations, since I assume creduce is going to invoke clang way more than len(args) times. I see a docstring detailing that `simplify` is only meant to speed things up now, so I'm content with the way things are.
I think it makes sense to remove some clang args before running creduce since removal of some flags can massively speed up reduction later (e.g. not emitting debug info or compiling at -O0, only doing -emit-llvm, etc) if they avoid expensive optimizations that don't cause the crash.
================
Comment at: clang/utils/creduce-clang-crash.py:303
+ opts_startswith=["-O"])
+ self.clang_args = new_args
+ verbose_print("Simplified command:", quote_cmd(self.get_crash_cmd()))
----------------
george.burgess.iv wrote:
> FWIW, opportunistically trying to add `-fsyntax-only` may help here: if the crash is in the frontend, it means that non-repros will stop before codegen, rather than trying to generate object files, or whatever they were trying to generate in the first place.
Yes that sounds like a good idea! I just do -emit-llvm to avoid assembly output but for parser/sema crashes -fsyntax-only would save some time.
Another one I found useful was `-Werror=implicit-int` to get more readable test cases from creduce: https://github.com/CTSRD-CHERI/llvm-project/blob/master/clang/utils/creduce_crash_testcase.py#L851
Without that flag lots of test cases look really weird due to the implicit int and various inferred semicolons.
================
Comment at: clang/utils/creduce-clang-crash.py:64
+
+class Reduce:
+ def __init__(self, crash_script, file_to_reduce):
----------------
Does this need to be `Reduce(object):` for python2?
================
Comment at: clang/utils/creduce-clang-crash.py:123
+ # Look for specific error messages
+ regexes = [r"Assertion `(.+)' failed",
+ r"fatal error: backend error: (.+)",
----------------
Some operating systems use a different assertion format (see my reduce script: https://github.com/CTSRD-CHERI/llvm-project/blob/master/clang/utils/creduce_crash_testcase.py#L662)
For MacOS/FreeBSD we need to also handle `r"Assertion failed: \(.+\),"`. Over the past two years I have also had cases where the other message formats have been useful so I would quite like to see them added here as well.
================
Comment at: clang/utils/creduce-clang-crash.py:205
+ p = subprocess.Popen(cmd,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT)
----------------
If we are writing the preprocessed output to that tempfile anyway, we could use `stdout=tmpfile`?
For python3 this would be simpler with `subprocess.check_call()` but I'm not sure python 2.7 has it.
================
Comment at: clang/utils/creduce-clang-crash.py:310
+ # Remove other cases that aren't covered by the heuristic
+ new_args = self.try_remove_args(new_args, msg="Removed -mllvm",
+ opts_one_arg_startswith=["-mllvm"])
----------------
I would move this before the remove_arg_by_index call since all llvm args start with a `-` and try_remove_arg_by_index will cause lots of invalid command lines to be created otherwise.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D59725/new/
https://reviews.llvm.org/D59725
More information about the cfe-commits
mailing list