<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><span style="white-space: pre-wrap; line-height: 1.7;">At 2013-07-01 23:47:06,"Tobias Grosser" <tobias@grosser.es> wrote:</span><br><pre>>On 07/01/2013 06:51 AM, Star Tan wrote:
>>> Great. Now we have two test cases we can work with. Can you
>>> upload the LLVM-IR produced by clang -O0 (without Polly)?
>> Since tramp3d-v4.ll is too large (19M with 267 thousand lines), I would focus on the oggenc benchmark at first.
>> I attached oggenc.ll (LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.
>
>Sounds good.
>
>>> 2) Check why the Polly scop detection is failing
>>>
>>> You can use 'opt -polly-detect -analyze' to see the most common reasons
>>> the scop detection failed. We should verify that we perform the most
>>> common and cheap tests early.
>>>
>> I also attached the output file oggenc_polly_detect_analyze.log produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first, I thought of dumping all debugging information with the "-debug" option, but that dumps too much unrelated information produced by other passes. Do you know any option that allows me to dump debugging information for the "-polly-detect" pass while disabling debugging information for other passes?
>
>I really propose to not attach such large files. ;-)
>
>To dump debug info of just one pass you can use
>-debug-only=polly-detect. However, for performance measurements, you
>want to use a release build to get accurate numbers.
>
>Another flag that is interesting is the flag '-stats'. It gives me the
>following information:
>
>   4 polly-detect - Number of bad regions for Scop: CFG too complex
> 183 polly-detect - Number of bad regions for Scop: Expression not affine
> 103 polly-detect - Number of bad regions for Scop: Found base address alias
> 167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
>  59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
> 725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
>  93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
>   8 polly-detect - Number of bad regions for Scop: Others
>  53 polly-detect - Number of regions that a valid part of Scop
>
>This seems to suggest that most scops fail due to loop bounds that
>can not be computed. It would be interesting to see what kind of
>expressions these are. In case SCEV often does not deliver a result,
>this may be one of the cases where bottom up scop detection would help
>a lot, as outer regions are automatically invalidated if we can not get
>a SCEV for the loop bounds of the inner regions.</pre><pre>Thank you so much. This is what I need. I just want to know why these scops are invalid!</pre><pre>
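To make the ranking in the quoted -stats output explicit, here is a minimal Python sketch that parses such lines and sorts the rejection reasons by count. It is only an illustration, not part of Polly: the helper names parse_stats and rank_failures are made up, and the line layout ("count pass - description") is assumed from the output quoted above.

```python
import re

# Counts copied from the -stats output quoted above.
STATS = """\
   4 polly-detect - Number of bad regions for Scop: CFG too complex
 183 polly-detect - Number of bad regions for Scop: Expression not affine
 103 polly-detect - Number of bad regions for Scop: Found base address alias
 167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
  59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
 725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
  93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
   8 polly-detect - Number of bad regions for Scop: Others
  53 polly-detect - Number of regions that a valid part of Scop
"""

# Assumed layout of one statistics line: count, pass name, " - ", description.
LINE = re.compile(r"\s*(\d+)\s+(\S+)\s+-\s+(.*)")
PREFIX = "Number of bad regions for Scop: "


def parse_stats(text, pass_name="polly-detect"):
    """Return (count, description) pairs for one pass."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(2) == pass_name:
            rows.append((int(match.group(1)), match.group(3).strip()))
    return rows


def rank_failures(rows):
    """Keep only the rejection counters and sort them, largest first."""
    bad = [(n, d[len(PREFIX):]) for n, d in rows if d.startswith(PREFIX)]
    return sorted(bad, reverse=True)


ranked = rank_failures(parse_stats(STATS))
total = sum(count for count, _ in ranked)
for count, reason in ranked:
    print(f"{count:5d}  {100 * count / total:5.1f}%  {reason}")
```

With the numbers above, "Loop bounds can not be computed" accounts for 725 of the 1342 rejected regions, so that is where I will look first.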
>
>However, I still have the feeling the test case is too large. To reduce
>it, I propose to first run opt with 'opt -O3 -polly
>-disable-inlining -time-passes'. You then make all internal function
>definitions external with s/define internal/define/. After this
>preprocessing you can use a regexp
>such as "'<,'>s/define \([^{}]* \){\_[^{}]*}/declare \1" to replace
>function definitions with their declaration. You can use this to binary
>search for functions that have a large overhead in ScopDetect time.
>
>I tried this a little, but realized that no matter if I removed the
>first or the second part of a module, the relative scop-detect time
>always went down. This is surprising. If you see similar effects, it
>would be interesting to investigate.
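The reduction step described above could also be scripted instead of done in the editor. The following is a rough Python sketch under stated assumptions: the pattern is a direct translation of the vim regexp from the mail, the helper names externalize and stub_out are hypothetical, and, like the original regexp, it only matches function bodies that contain no braces of their own.

```python
import re

# Translation of the vim pattern from the mail: "define", the signature up
# to the opening brace, then a body that contains no nested braces.
DEFINITION = re.compile(r"define ([^{}]* )\{[^{}]*\}")


def externalize(ir_text):
    """Apply s/define internal/define/ to the whole module text."""
    return ir_text.replace("define internal", "define")


def stub_out(ir_text, should_stub):
    """Replace a definition by a declaration when should_stub(signature) is true."""
    def replace(match):
        signature = match.group(1)
        if should_stub(signature):
            return "declare " + signature.rstrip()
        return match.group(0)  # keep this function body untouched
    return DEFINITION.sub(replace, ir_text)


# Tiny made-up module, only to show the effect.
SAMPLE = """\
define internal i32 @f(i32 %x) {
entry:
  ret i32 %x
}

define i32 @g() {
entry:
  %r = call i32 @f(i32 1)
  ret i32 %r
}
"""

reduced = stub_out(externalize(SAMPLE), lambda sig: "@f(" in sig)
print(reduced)
```

For the binary search, should_stub would select one half of the functions per round, and the ScopDetect time reported by -time-passes would be compared between the two halves.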
</pre><pre>No problem. I will try to reduce the code size.</pre><pre>Best,</pre><pre>Star Tan</pre></div>