[LLVMdev] Functions: sret and readnone

Stephan stephan.reiter at gmail.com
Thu Nov 5 05:32:01 PST 2009


It's been a while and I finally had the time to look into this.

I built a custom AliasAnalysis pass, as Chris suggested. It returns
AliasAnalysis::Mod for values passed to the sample function in the
sret position, and NoModRef for all other values. The pass's
getModRefBehavior methods also return
AliasAnalysis::AccessesArguments. So far this approach has not
worked, and I hope someone has an idea on how to fix it.
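
To make the intended mod/ref policy concrete, here is a minimal
standalone C++ sketch. It does not use LLVM's headers or its
AliasAnalysis interface; the enum, the CallSite struct, and the
function below are hypothetical stand-ins that only model the answers
described above (Mod for the sret operand of the sample function,
NoModRef for everything else).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Standalone model; these types are hypothetical and are NOT LLVM's classes.
enum class ModRefResult { NoModRef, Ref, Mod, ModRef };

struct CallSite {
    std::string callee;             // name of the called function
    std::vector<std::string> args;  // names of the operands, in order
};

// Policy from the text: the sample function writes only through its sret
// argument (operand 0) and neither reads nor writes any other memory.
ModRefResult getModRefInfo(const CallSite &call, const std::string &ptr) {
    if (call.callee != "sample$int$float2")
        return ModRefResult::ModRef;  // unknown callee: stay conservative
    assert(!call.args.empty() && "sample takes its sret pointer first");
    return call.args[0] == ptr ? ModRefResult::Mod : ModRefResult::NoModRef;
}
```

In this model, querying the call from the example further down with
%5 yields Mod, and querying it with any other pointer yields NoModRef.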

Here's a step by step illustration of the problem:

1. The following source code is compiled ...

intrinsic float4 sample(int tex, float2 tc);

float4 main(int tex, float2 tc)
{
	float4 x = sample(tex, tc);
	return 0.0;
}

2. ... into the following LLVM code (after a bunch of optimizations
have run):

define void @"main$int$float2"([4 x float]* noalias nocapture sret, i32, [2 x float]) nounwind {
  %5 = alloca [4 x float], align 4                ; <[4 x float]*> [#uses=1]
  call void @"sample$int$float2"([4 x float]* %5, i32 %1, [2 x float] %2)
  store [4 x float] zeroinitializer, [4 x float]* %0
  ret void
}

declare void @"sample$int$float2"([4 x float]* noalias nocapture sret, i32, [2 x float]) nounwind

As you can see, the call to the sample function is still present,
although the actual value it is supposed to return via its sret
parameter is never used.
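
The removal hoped for here can be stated as a condition: the call's
only effect is a write to a local alloca, and nothing else uses that
alloca. The following standalone C++ sketch models just that
condition. It is not how any LLVM pass is implemented; the Inst
record and both functions are hypothetical, and the model ignores
other concerns such as whether the callee can unwind or fail to
return.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified instruction record; not LLVM's data structures.
struct Inst {
    std::string kind;               // "alloca", "call", "store", "load", "ret"
    std::string result;             // value defined by the instruction, if any
    std::vector<std::string> uses;  // values used as operands
    std::string writesOnlyTo;       // calls: the sole pointer written, "" if unknown
};

// Count the instructions that use `value` as an operand.
int countUsers(const std::vector<Inst> &body, const std::string &value) {
    assert(!value.empty() && "value name must not be empty");
    int users = 0;
    for (const Inst &inst : body) {
        for (const std::string &operand : inst.uses) {
            if (operand == value) {
                ++users;
                break;  // count each instruction at most once
            }
        }
    }
    return users;
}

// In this model a call is removable if it writes only to a local alloca
// and the call itself is the only user of that alloca.
bool callIsDead(const std::vector<Inst> &body, const Inst &call) {
    if (call.kind != "call" || call.writesOnlyTo.empty())
        return false;
    bool isLocalAlloca = false;
    for (const Inst &inst : body)
        if (inst.kind == "alloca" && inst.result == call.writesOnlyTo)
            isLocalAlloca = true;
    return isLocalAlloca && countUsers(body, call.writesOnlyTo) == 1;
}
```

Applied to a model of the function above (an alloca defining %5, the
call using %5, %1 and %2 and writing only to %5, a store to %0, and a
ret), callIsDead returns true for the call. Adding any load from %5
makes it return false.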

Using the AAEvalPass I found out that the alias analysis pass I
implemented seems to work alright (it reports mod for %5):

===== Alias Analysis Evaluator Report =====
  3 Total Alias Queries Performed
  3 no alias responses (100.0%)
  0 may alias responses (0.0%)
  0 must alias responses (0.0%)
  Alias Analysis Evaluator Pointer Alias Summary: 100%/0%/0%
  3 Total ModRef Queries Performed
  2 no mod/ref responses (66.6%)
  1 mod responses (33.3%)
  0 ref responses (0.0%)
  0 mod & ref responses (0.0%)
  Alias Analysis Evaluator Mod/Ref Summary: 66%/33%/0%/0%

Yet DCE, DSE, and GVN all fail to remove the function call. (I am not
sure which optimization pass should handle this, so I picked these
three because they seemed plausible.)

Any ideas? Help would be very much appreciated!

Thank you,
Stephan

On 6 Oct., 08:00, Stephan <stephan.rei... at gmail.com> wrote:
> On 5 Oct., 23:33, Dan Gohman <goh... at apple.com> wrote:
>
> > Is there a reason it needs to be an array? A vector of four floats
> > wouldn't have this problem, if that's an option.
>
> Unfortunately that's not an option. At the moment I'm restricting
> myself to the use of scalar code only, in order to be able to
> vectorize the code easily later (e.g., float4 as it is now will then
> become an array of four vectors for parallel processing of n (probably
> 4, SSE) pixels). But thanks for coming up with this idea!
>
> Chris, I'll take a look at the AliasAnalysis functionality. Depending
> on how much effort it is to implement a solution I might follow this
> approach. If not, there's still Kenneth's new code generator to look
> forward to. :)
>
> Thanks,
> Stephan



