[llvm-dev] [RFC][SLP] Let's turn -slp-vectorize-hor on by default
Charlie Turner via llvm-dev
llvm-dev at lists.llvm.org
Tue Nov 10 06:49:32 PST 2015
> Out of curiosity, how much of the compile time are we spending in the SLP vectorizer nowadays ?
My measurements were originally based off the "real time" reports from
/usr/bin/time (not the bash built-in), so I didn't have per-pass
statistics to hand. I did a quick experiment in which I compiled each
of the SPEC files with opt's -time-passes feature.
The "raw" numbers show that SLP can take anywhere from 0 to 30% of the
total optimization time. The high end of that scale is not very
reliable, though: some of the biggest offenders are rather small
bitcode files, where the total compile time is itself very small.
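To illustrate why the percentages for small files are shaky, here is a
minimal Python sketch. It is not the script I used; the helper name
pass_share and all of the timings are made up for illustration, and the
only thing assumed about the -time-passes report is that it gives a
wall-clock time per pass.

```python
def pass_share(wall_times_ms, pass_name):
    """Return pass_name's percentage of the summed per-pass wall time."""
    total = sum(wall_times_ms.values())
    if total == 0:
        return 0.0
    return 100.0 * wall_times_ms.get(pass_name, 0) / total


# Hypothetical per-pass wall-clock times in milliseconds (NOT SPEC data).
small_file = {"SLP Vectorizer": 3, "all other passes": 7}
large_file = {"SLP Vectorizer": 20, "all other passes": 1980}

# The same 1 ms of timer jitter, charged to SLP in both files.
noisy_small = dict(small_file)
noisy_small["SLP Vectorizer"] += 1
noisy_large = dict(large_file)
noisy_large["SLP Vectorizer"] += 1

for label, times in [("small", small_file), ("small + 1 ms", noisy_small),
                     ("large", large_file), ("large + 1 ms", noisy_large)]:
    print(f"{label:>13}: SLP share = {pass_share(times, 'SLP Vectorizer'):.2f}%")
```

With these invented numbers, one millisecond of jitter moves the share
for the small file by several percentage points, while the share for
the large file barely changes.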
The largest bitcode file[*] I had in SPEC2006 was about 1MiB. For that
particular example, SLP took less than 1% of the opt time.
For all bitcode files in SPEC2006 between 100KiB and 1MiB, SLP takes
less than 5% of compile time.
In tensor.bc (~ 80KiB) from SPEC2006, SLP took around 9.5% (+- 1%).
This was a borderline case of a compile-time impact with horizontal
reductions (about a 0.8% regression, so within stddev). There were
actually swings the other way as well (i.e., SLP was slower without
horizontal reduction detection), so it's hard to make any judgment
here.
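As a rough sketch of what "within stddev" means here, the Python below
compares two sets of repeated timings. The function names and the
sample timings are hypothetical (they are not the tensor.bc
measurements), and the rule used, comparing the difference in means
against the larger sample standard deviation, is just one simple
convention rather than a proper significance test.

```python
from statistics import mean, stdev


def relative_change_percent(baseline, candidate):
    """Percentage change of the candidate mean relative to the baseline mean."""
    base = mean(baseline)
    return 100.0 * (mean(candidate) - base) / base


def within_noise(baseline, candidate):
    """True if the means differ by no more than the larger sample stddev."""
    delta = abs(mean(candidate) - mean(baseline))
    noise = max(stdev(baseline), stdev(candidate))
    return delta <= noise


# Hypothetical repeated wall-clock timings in milliseconds (NOT SPEC data).
without_hor = [100, 102, 98, 101, 99]
with_hor = [101, 103, 99, 102, 99]

change = relative_change_percent(without_hor, with_hor)
print(f"change: {change:+.1f}%, within noise: {within_noise(without_hor, with_hor)}")
```

For these invented samples the change is a small slowdown that is
smaller than the run-to-run spread, which is the sort of result I would
not read much into.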
Another pretty interesting one is fnpovfpu.bc (~ 40KiB), where SLP
took 17% of compile time.
Anyway, I hope that gives a rough impression of what's going on. I was
taking the wall clock time measurement from -time-passes.
[*] In my haste, I screwed up initially by not reporting the overall
compile time, so as a proxy metric, I went back and collected bitcode
file sizes, which saved me from having to rerun everything :/