[llvm-dev] flags to reproduce clang -O3 with opt -O3
Chad Verbowski via llvm-dev
llvm-dev at lists.llvm.org
Sun Jul 5 12:57:37 PDT 2020
FYI for others. I (think I) figured out how to dump the llc flags:
llvm-as < /dev/null | llc -O3 -debug-pass=Arguments
I was surprised that -O1, -O2, and -O3 all seem to produce the same list of
flags, while -O0 produces a subset of it.
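
In case it helps anyone repeat the comparison, here is a rough sketch of how the
flag lists for two -O levels could be diffed. The helper names pass_flags and
llc_flags are my own (hypothetical), and I am assuming the "Pass Arguments:"
lines are written to stderr, which is why stderr is sent down the pipe and the
generated assembly is thrown away.

```shell
# pass_flags: read -debug-pass=Arguments output on stdin and print the
# distinct pass flags, one per line, sorted.
pass_flags() {
  grep '^Pass Arguments:' |
    sed 's/^Pass Arguments:[[:space:]]*//' |
    tr -s ' ' '\n' |
    grep -v '^$' |
    LC_ALL=C sort -u
}

# llc_flags (hypothetical helper): dump the llc pass flags for one -O level.
# Assumption: the pass list goes to stderr, so stderr is piped and the
# assembly on stdout is discarded.
llc_flags() {
  llvm-as < /dev/null | llc "$1" -debug-pass=Arguments 2>&1 >/dev/null | pass_flags
}

# Example: flags used at -O3 but not at -O0.
#   llc_flags -O0 > O0.txt
#   llc_flags -O3 > O3.txt
#   LC_ALL=C comm -13 O0.txt O3.txt
```

Note that this only compares flag names; it says nothing about the order in
which the passes run or how often each one is scheduled.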
On Thu, Jul 2, 2020 at 8:00 PM Chad Verbowski <chad at verbowski.com> wrote:
> Thanks.
>
> My intent is to reduce the overall compile time by eliminating unused
> optimizations.
>
> Do you happen to know if there is a list somewhere (or a way to dump /
> extract them) of the individual flags which llc uses as part of -O3, so I
> can perhaps experiment with removing the unnecessary ones for my code?
>
> On Thu, Jul 2, 2020 at 7:55 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>>
>>
>> On Thu, Jul 2, 2020 at 7:50 PM Chad Verbowski <chad at verbowski.com> wrote:
>>
>>> Awesome, thanks!
>>>
>>> I'd like to have the last step (llc in your example) not perform
>>> additional optimization passes, such as O3, and simply use the O3 pass from
>>> opt in the previous line.
>>>
>>> Do you happen to know if I should use 'llc -O0 foo_o.bc -o foo.exe'
>>> instead to achieve this?
>>>
>>
>> No, you should use `llc -O3`: it controls only the backend part of
>> the pipeline.
>>
>>
>>>
>>> On Thu, Jul 2, 2020 at 6:35 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jul 2, 2020 at 2:28 PM Chad Verbowski via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I've been trying to figure out how to reproduce the results of a
>>>>> single clang -O3 compilation to a binary with a multi-step process using
>>>>> opt.
>>>>>
>>>>> Specifically I have:
>>>>>
>>>>> clang -O3 foo.c -o foo.exe
>>>>>
>>>>>
>>>>> which I want to replicate with:
>>>>>
>>>>> clang -O0 -c -emit-llvm foo.c
>>>>>
>>>>>
>>>> Using O0 will mark every function in the IR with "optnone" which
>>>> prevents `opt` from optimizing it. I'd try `clang -O3 -Xclang
>>>> -disable-llvm-passes -c -emit-llvm foo.c`
>>>>
>>>>
>>>>> opt -O3 foo.bc -o foo_o.bc
>>>>> clang foo_o.bc -o foo.exe
>>>>>
>>>>>
>>>> This last step won't enable optimizations in the backend; you should
>>>> likely try `llc -O3` instead.
>>>>
>>>> Best,
>>>>
>>>> --
>>>> Mehdi
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> Any hints / suggestions on what additional flags I need to produce the
>>>>> same binary are greatly appreciated!
>>>>>
>>>>> *What I've tried:*
>>>>> I've been reading the archives, and found this
>>>>> <http://lists.llvm.org/pipermail/llvm-dev/2017-September/117144.html>,
>>>>> which suggests dumping the pass arguments using:
>>>>>
>>>>> clang -mllvm -debug-pass=Structure -O3 foo.c -o foo.exe
>>>>>
>>>>> and comparing with:
>>>>>
>>>>> clang -mllvm -debug-pass=Structure -O0 -c -emit-llvm foo.c
>>>>> opt -debug-pass=Structure -O3 foo.bc -o foo_o.bc
>>>>> clang -mllvm -debug-pass=Structure foo_o.bc -o foo.exe
>>>>>
>>>>>
>>>>> The first has 30 "Pass Arguments" statements, though only 5 of them are
>>>>> distinct. Across these 5 there are 190 distinct flags. The multi-step
>>>>> compilation has only 140 distinct flags. Comparing the two sets, 18 flags
>>>>> from the multi-step build are missing from the single-pass (1pass) build,
>>>>> and 67 flags from the single-pass build are missing from the multi-step one.
>>>>>
>>>>> These appear to be opt flags, since they cause an error when I try to
>>>>> use them with clang (e.g. -x86-fixup-LEAs), and some of them, when used
>>>>> with opt, cause a crash with a stack dump and a request to submit a bug
>>>>> report. Others, like -attributor, appear to work with opt.
>>>>>
>>>>> I'm currently blindly trying to add the 67 different flags to the opt
>>>>> step to see which work, and hopefully that subset will produce the same
>>>>> result as clang -O3.
>>>>>
>>>>> It seems like there must be an easier / more exact way of getting the
>>>>> opt -O3 multi-step to match the clang -O3 result.
>>>>>
>>>>> Any thoughts or insights are appreciated. Below is a sorted list of
>>>>> the flags missing from each for completeness.
>>>>>
>>>>> not contained in 1pass O3 (count=18)
>>>>>
>>>>> -aa-scalar-evolution
>>>>> -always-inline
>>>>> -callsite-splitting
>>>>> -inject-tli-mappings
>>>>> -ipsccp
>>>>> -jump-threading-correlated-propagation
>>>>> -livedebugvalues
>>>>> -loops-loop-simplify
>>>>> -memdep-lazy-branch-prob
>>>>> -openmpopt
>>>>> -opt-remark-emitter-instcombine
>>>>> -regallocfast
>>>>> -speculative-execution
>>>>> -stackmap-liveness
>>>>> -tbaa-scoped-noalias
>>>>> -vector-combine
>>>>> -verify
>>>>> -write-bitcode
>>>>>
>>>>> not contained in multi O3 (count=67)
>>>>>
>>>>> -attributor
>>>>> -block-freq-loop-simplify
>>>>> -branch-folder
>>>>> -break-false-deps
>>>>> -callsite-splitting-ipsccp
>>>>> -codegenprepare
>>>>> -consthoist
>>>>> -dead-mi-elimination
>>>>> -detect-dead-lanes
>>>>> -early-ifcvt
>>>>> -early-machinelicm
>>>>> -early-tailduplication
>>>>> -expandmemcmp
>>>>> -greedy
>>>>> -interleaved-access
>>>>> -iv-users
>>>>> -lazy-block-freq-opt-remark-emitter
>>>>> -livedebugvars
>>>>> -liveintervals
>>>>> -liveregmatrix
>>>>> -livestacks
>>>>> -livevars
>>>>> -loop-reduce
>>>>> -loop-simplify-lcssa-verification
>>>>> -lrshrink
>>>>> -machine-block-freq
>>>>> -machine-combiner
>>>>> -machine-cp
>>>>> -machine-cse
>>>>> -machinedomtree-machine-loops
>>>>> -machinelicm
>>>>> -machine-loops
>>>>> -machinepostdomtree
>>>>> -machinepostdomtree-block-placement
>>>>> -machine-scheduler
>>>>> -machine-sink
>>>>> -machine-trace-metrics
>>>>> -mergeicmps
>>>>> -objc-arc-contract
>>>>> -opt-phis
>>>>> -partially-inline-libcalls
>>>>> -peephole-opt
>>>>> -postra-machine-sink
>>>>> -post-RA-sched
>>>>> -processimpdefs
>>>>> -reaching-deps-analysis
>>>>> -rename-independent-subregs
>>>>> -shrink-wrap
>>>>> -simple-register-coalescing
>>>>> -slotindexes
>>>>> -spill-code-placement
>>>>> -stack-coloring
>>>>> -stackmap-liveness-livedebugvalues
>>>>> -stack-slot-coloring
>>>>> -tailduplication
>>>>> -unreachable-mbb-elimination
>>>>> -virtregmap
>>>>> -virtregrewriter
>>>>> -x86-avoid-SFB
>>>>> -x86-cf-opt
>>>>> -x86-cmov-conversion
>>>>> -x86-domain-reassignment
>>>>> -x86-evex-to-vex-compress
>>>>> -x86-execution-domain-fix
>>>>> -x86-fixup-bw-insts
>>>>> -x86-fixup-LEAs
>>>>> -x86-optimize-LEAs
>>>>>
>>>>>
>>>>>
>>>>
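
For reference, putting Mehdi's two suggestions together, the multi-step build
might look roughly like the sketch below. The run and build_o3 helpers and the
DRY_RUN variable are hypothetical names of mine, not anything from LLVM. I have
not confirmed that this yields a binary identical to a plain clang -O3 build,
and on platforms where clang links position-independent executables by default,
llc may additionally need -relocation-model=pic for the link step to succeed.

```shell
# run: echo a command, then execute it unless DRY_RUN is set, so the
# steps can be inspected without clang/opt/llc installed.
run() {
  echo "+ $*"
  [ -n "${DRY_RUN:-}" ] || "$@"
}

# build_o3 (hypothetical helper): $1 is a C source file such as foo.c.
build_o3() {
  src=$1
  base=${src%.c}
  # Front end only: -O3 avoids "optnone"; -disable-llvm-passes leaves
  # the emitted IR unoptimized.
  run clang -O3 -Xclang -disable-llvm-passes -c -emit-llvm "$src" -o "$base.bc" &&
  # Middle end: the -O3 IR optimization pipeline.
  run opt -O3 "$base.bc" -o "${base}_o.bc" &&
  # Back end: code generation at -O3, emitting an object file.
  run llc -O3 -filetype=obj "${base}_o.bc" -o "$base.o" &&
  # Link with the clang driver.
  run clang "$base.o" -o "$base.exe"
}

# Usage:
#   DRY_RUN=1; build_o3 foo.c    # only print the four commands
#   unset DRY_RUN; build_o3 foo.c
```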