[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass
Tobias Grosser
tobias at grosser.es
Mon Jul 1 08:47:06 PDT 2013
On 07/01/2013 06:51 AM, Star Tan wrote:
>> Great. Now we have two test cases we can work with. Can you
>> upload the LLVM-IR produced by clang -O0 (without Polly)?
> Since tramp3d-v4.ll is too large (19M with 267 thousand lines), I would focus on the oggenc benchmark at first.
> I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.
Sounds good.
>> 2) Check why the Polly scop detection is failing
>>
>> You can use 'opt -polly-detect -analyze' to see the most common reasons
>> the scop detection failed. We should verify that we perform the most
>> common and cheap tests early.
>>
> I also attached the output file oggenc_polly_detect_analyze.log produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first, I thought of dumping all debugging information with the "-debug" option, but that dumps too much unrelated information produced by other passes. Do you know of an option that dumps debugging information for the "-polly-detect" pass while disabling debugging information for all other passes?
I would really suggest not attaching such large files. ;-)
To dump the debug output of just one pass you can use
-debug-only=polly-detect. However, for performance measurements, you
want to use a release build to get accurate numbers.
Another interesting flag is '-stats'. It gives me the following
information:
   4 polly-detect - Number of bad regions for Scop: CFG too complex
 183 polly-detect - Number of bad regions for Scop: Expression not affine
 103 polly-detect - Number of bad regions for Scop: Found base address alias
 167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
  59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
 725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
  93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
   8 polly-detect - Number of bad regions for Scop: Others
  53 polly-detect - Number of regions that a valid part of Scop
This seems to suggest that most scops fail due to loop bounds that
cannot be computed. It would be interesting to see what kind of
expressions these are. In case SCEV often does not deliver a result,
this may be one of the cases where bottom-up scop detection would help
a lot, as outer regions are automatically invalidated if we cannot get
a SCEV for the loop bounds of the inner regions.
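
To rank the rejection reasons without reading the numbers by eye, the '-stats' output can be post-processed. The following is only a minimal sketch: it assumes each statistic sits on a single line of the form "<count> <pass> - <description>", and the helper names parse_stats and rejection_shares are made up for this illustration, not part of LLVM or Polly.

```python
import re

# One '-stats' line, assumed to look like: "<count> <pass-name> - <description>".
STAT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*\S)\s*$")


def parse_stats(text, pass_name="polly-detect"):
    """Return (count, description) pairs for one pass, largest count first."""
    rows = []
    for line in text.splitlines():
        m = STAT_LINE.match(line)
        # Header and separator lines do not match and are skipped.
        if m and m.group(2) == pass_name:
            rows.append((int(m.group(1)), m.group(3)))
    return sorted(rows, reverse=True)


def rejection_shares(rows, prefix="Number of bad regions for Scop: "):
    """Return (reason, count, share) for every 'bad region' statistic."""
    bad = [(n, d[len(prefix):]) for n, d in rows if d.startswith(prefix)]
    total = sum(n for n, _ in bad)
    if total == 0:
        return []
    return [(reason, n, n / total) for n, reason in bad]
```

Applied to the numbers above, "Loop bounds can not be computed" accounts for 725 of the 1342 rejection counts, roughly 54%.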
However, I still have the feeling the test case is too large. To
reduce it, I propose to first run opt with 'opt -O3 -polly
-disable-inlining -time-passes'. You then strip the internal linkage
from all function definitions with
s/define internal/define/. After this preprocessing you can use a regexp
such as "'<,'>s/define \([^{}]* \){\_[^{}]*}/declare \1" to replace
function definitions with their declarations. You can use this to binary
search for functions that have a large overhead in ScopDetect time.
I tried this a little, but realized that no matter if I removed the
first or the second part of a module, the relative scop-detect time
always went down. This is surprising. If you see similar effects, it
would be interesting to investigate.
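
The two substitutions above can also be scripted, which makes the binary search easier to repeat. This is a rough sketch under stated assumptions, not a tested reduction tool: the function names stub_functions and halves are invented here, the pattern is a Python counterpart of the vim regexp above, and definitions whose signature or body contains braces (for example literal struct types) are simply left untouched. Nothing checks that the resulting module is still valid IR.

```python
import itertools
import re

# A 'define' header followed by a brace-delimited body that contains no
# braces itself. In Python [^{}] also matches newlines, so no equivalent
# of vim's \_ is needed.
DEFINITION = re.compile(r"define ([^{}]*) \{[^{}]*\}")


def stub_functions(ir_text, keep):
    """Replace each matched definition with a declaration unless keep(i) is true.

    i is the position of the definition in the file, counting from 0.
    """
    index = itertools.count()

    def repl(match):
        if keep(next(index)):
            return match.group(0)
        return "declare " + match.group(1)

    # Same preprocessing as s/define internal/define/.
    return DEFINITION.sub(repl, ir_text.replace("define internal", "define"))


def halves(ir_text):
    """Return two modules: one keeping the first half of the bodies, one the second."""
    n = len(DEFINITION.findall(ir_text.replace("define internal", "define")))
    first = stub_functions(ir_text, lambda i: i < n // 2)
    second = stub_functions(ir_text, lambda i: i >= n // 2)
    return first, second
```

Each half can then be run through opt with -time-passes again, and the half with the larger scop detection time is split further.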
Cheers,
tobi