[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass

Tobias Grosser tobias at grosser.es
Mon Jul 1 08:47:06 PDT 2013


On 07/01/2013 06:51 AM, Star Tan wrote:
>> Great. Now we have two test cases we can work with. Can you
>> upload the LLVM-IR produced by clang -O0 (without Polly)?
> Since tramp3d-v4.ll is too large (19M with 267 thousand lines), I would focus on the oggenc benchmark at first.
> I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.

Sounds good.

>> 2) Check why the Polly scop detection is failing
>>
>> You can use 'opt -polly-detect -analyze' to see the most common reasons
>> the scop detection failed. We should verify that we perform the most
>> common and cheap tests early.
>>
> I also attached the output file oggenc_polly_detect_analyze.log produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first, I thought of dumping all debugging information with the "-debug" option, but that dumps too much unrelated information produced by other passes. Do you know of an option that lets me dump debugging information for the "-polly-detect" pass while disabling debugging information for the other passes?

I would really suggest not attaching such large files. ;-)

To dump debug info of just one pass you can use
-debug-only=polly-detect. However, for performance measurements, you
want to use a release build to get accurate numbers.

Another flag that is interesting is the flag '-stats'. It gives me the 
following information:

     4 polly-detect
                  - Number of bad regions for Scop: CFG too complex
   183 polly-detect
                  - Number of bad regions for Scop: Expression not affine
   103 polly-detect
                  - Number of bad regions for Scop: Found base address
                    alias
   167 polly-detect
                  - Number of bad regions for Scop: Found invalid region
                    entering edges
    59 polly-detect
                  - Number of bad regions for Scop: Function call with
                    side effects appeared
   725 polly-detect
                  - Number of bad regions for Scop: Loop bounds can not
                    be computed
    93 polly-detect
                  - Number of bad regions for Scop: Non canonical
                    induction variable in loop
     8 polly-detect
                  - Number of bad regions for Scop: Others
    53 polly-detect
                  - Number of regions that a valid part of Scop

This seems to suggest that most scops fail due to loop bounds that
can not be computed. It would be interesting to see what kind of
expressions these are. In case SCEV often does not deliver a result,
this may be one of the cases where bottom-up scop detection would help
a lot, as outer regions are automatically invalidated if we can not get
a SCEV for the loop bounds of the inner regions.
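
The ranking behind this reading can be reproduced with a small script. The following is only a sketch: it assumes the wrapped layout of the '-stats' output exactly as quoted in this mail (a count line followed by one or more description lines), and the function names are made up for illustration.

```python
import re

# A count line looks like "   725 polly-detect"; the description follows
# on the next line(s), as wrapped in this mail (an assumption of this sketch).
COUNT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s*$")
BAD_PREFIX = "Number of bad regions for Scop: "

STATS = """
     4 polly-detect
                  - Number of bad regions for Scop: CFG too complex
   183 polly-detect
                  - Number of bad regions for Scop: Expression not affine
   103 polly-detect
                  - Number of bad regions for Scop: Found base address
                    alias
   167 polly-detect
                  - Number of bad regions for Scop: Found invalid region
                    entering edges
    59 polly-detect
                  - Number of bad regions for Scop: Function call with
                    side effects appeared
   725 polly-detect
                  - Number of bad regions for Scop: Loop bounds can not
                    be computed
    93 polly-detect
                  - Number of bad regions for Scop: Non canonical
                    induction variable in loop
     8 polly-detect
                  - Number of bad regions for Scop: Others
    53 polly-detect
                  - Number of regions that a valid part of Scop
"""


def parse_wrapped_stats(text):
    """Return (count, pass_name, description) tuples from wrapped stats text."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = COUNT_LINE.match(line)
        if match:
            entries.append([int(match.group(1)), match.group(2), ""])
        elif entries:
            # Description line or its continuation: drop the leading "- ".
            piece = stripped[2:] if stripped.startswith("- ") else stripped
            entries[-1][2] = (entries[-1][2] + " " + piece).strip()
    return [tuple(entry) for entry in entries]


def rank_failures(entries):
    """Sort the 'bad region' reasons by count and attach their share."""
    bad = [(count, desc[len(BAD_PREFIX):])
           for count, _, desc in entries if desc.startswith(BAD_PREFIX)]
    total = sum(count for count, _ in bad)
    if total == 0:
        return []
    return [(reason, count, count / total)
            for count, reason in sorted(bad, reverse=True)]


if __name__ == "__main__":
    for reason, count, share in rank_failures(parse_wrapped_stats(STATS)):
        print(f"{count:5d}  {share:6.1%}  {reason}")
```

On the numbers above, "Loop bounds can not be computed" accounts for 725 of the 1342 rejected regions, roughly 54%.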

However, I still have the feeling the test case is too large. To reduce
it, I propose to first run opt with 'opt -O3 -polly
-disable-inlining -time-passes'. You then turn all internal function
definitions into external ones with s/define internal/define/. After this
preprocessing you can use a regexp such as
"'<,'>s/define \([^{}]* \){\_[^{}]*}/declare \1" to replace
function definitions with their declarations. You can use this to binary
search for functions that have a large overhead in ScopDetect time.
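
For scripting the same reduction outside the editor, here is a rough Python equivalent of the two substitutions. It is a sketch under assumptions: the helper names (externalize, stub_out) and the sample IR are invented for illustration, and like the regexp above it only matches definitions whose header and body contain no '{' or '}' at all, so functions that use literal struct types are left untouched.

```python
import re

# Mirrors the regexp from the mail: a one-line 'define' header without
# braces, followed by a body that itself contains no braces.
DEFINITION = re.compile(r"define ([^{}\n]* )\{[^{}]*\}")
# First '@name(' in the header; quoted names are not handled (assumption).
NAME = re.compile(r"@([-\w.$]+)\s*\(")

SAMPLE = """define internal i32 @f(i32 %x) {
entry:
  ret i32 %x
}

define i32 @g() {
entry:
  ret i32 0
}
"""


def externalize(ir_text):
    """Equivalent of s/define internal/define/."""
    return ir_text.replace("define internal", "define")


def stub_out(ir_text, names=None):
    """Replace matching definitions with declarations.

    names=None stubs every matching definition; otherwise only the
    functions whose name is in `names` are replaced.
    """
    def replace(match):
        header = match.group(1)
        found = NAME.search(header)
        if names is not None and (found is None or found.group(1) not in names):
            return match.group(0)  # keep this definition as it is
        return "declare " + header.rstrip()

    return DEFINITION.sub(replace, ir_text)


if __name__ == "__main__":
    # Stub out only @f and keep the body of @g.
    print(stub_out(externalize(SAMPLE), {"f"}))
```

Passing a set of names makes it possible to stub out one half of the functions at a time, which is what the binary search needs.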

I tried this a little, but realized that no matter whether I removed the
first or the second part of a module, the relative scop-detect time
always went down. This is surprising. If you see similar effects, it
would be interesting to investigate.
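
One way to make this halving experiment repeatable is sketched below. The measure callback is hypothetical and not part of any tool: it is assumed to build a module that keeps only the given function definitions, run opt on it with -time-passes, and return the fraction of the total pass time spent in scop detection.

```python
def halving_experiment(names, measure):
    """Compare the scop-detect share with each half of the functions removed.

    `names` is the list of function names defined in the module.
    `measure(kept)` is a hypothetical callback (see the text above) that
    returns the relative scop-detect time for a module keeping only `kept`.
    """
    middle = len(names) // 2
    first, second = names[:middle], names[middle:]

    baseline = measure(names)
    without_first = measure(second)   # first half turned into declarations
    without_second = measure(first)   # second half turned into declarations

    return {
        "baseline": baseline,
        "without_first_half": without_first,
        "without_second_half": without_second,
        # True reproduces the effect described above: the relative time
        # drops no matter which half is removed.
        "both_lower": without_first < baseline and without_second < baseline,
    }
```

If both_lower comes out True for a module, a plain binary search cannot point at a single half, and the two halves would have to be looked at separately.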

Cheers,
tobi




