[LLVMdev] Alias analysis issue with structs on PPC

Tue Mar 17 04:55:11 PDT 2015

----- Original Message -----
> From: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Daniel Berlin" <dberlin at dberlin.org>, llvmdev at cs.uiuc.edu, "Olivier Sallenave" <ol.sall at gmail.com>
> Sent: Tuesday, March 17, 2015 4:56:48 AM
> Subject: Re: [LLVMdev] Alias analysis issue with structs on PPC
> 
> Hal Finkel <hfinkel at anl.gov> wrote on 16.03.2015 17:56:20:
> 
> > If you want to do it at a clang level, the right thing to do is to
> > fixup the ABI lowerings for pointers to keep them pointers in this
> > case.
> > So this is an artifact of the way that we pass structures, and
> > constructing a general solution at the ABI level might be tricky.
> > I've cc'd Uli, who did most of the recent work here.
> >
> > For the single-element struct case, we could fix this by keeping it
> > a pointer type.  The relevant code in Clang is in lib/CodeGen/
> > TargetInfo.cpp (look at PPC64_SVR4_ABIInfo::classifyArgumentType
> > and
> > nearby code). But that does not really address the underlying
> > issue:
> >
> > If I take your example and modify it so that we have:
> >
> > struct box {
> >     double* source;
> >     double* source2;
> > };
> >
> > then the parameter is passed as:
> >
> > define void @test(double* noalias nocapture %result, [2 x i64] %
> > my_struct.coerce, i32 signext %len) #0
> >
> > and is extracted the same way:
> >
> >   %my_struct.coerce.fca.0.extract = extractvalue [2 x i64] %
> > my_struct.coerce, 0
> >   %0 = inttoptr i64 %my_struct.coerce.fca.0.extract to double*
> >
> > but, it is also important to realize that the i64 here in the array
> > type is actually chosen to satisfy alignment requirements. If we
> > have
> this:
> >
> > typedef float __attribute__((ext_vector_type(4))) vt;
> > struct box {
> >     double* source;
> >     double* source2;
> >     vt v;
> > };
> >
> > then the struct is passed as:
> >
> > define void @test(double* noalias nocapture %result, [2 x i128] %
> > my_struct.coerce, i32 signext %len)
> >
> > and the extraction code looks like:
> >
> >   %my_struct.coerce.fca.0.extract = extractvalue [2 x i128] %
> > my_struct.coerce, 0
> >   %my_struct.sroa.0.sroa.0.0.extract.shift = lshr i128 %
> > my_struct.coerce.fca.0.extract, 64
> >   %my_struct.sroa.0.sroa.0.0.extract.trunc = trunc i128 %
> > my_struct.sroa.0.sroa.0.0.extract.shift to i64
> >   %0 = inttoptr i64 %my_struct.sroa.0.sroa.0.0.extract.trunc to
> >   double*
> >
> > so just using pointer types instead of i64 will help common cases,
> > but will not address the general issue. Now part of this does some
> > down to using array parameters as a substitute for byval/direct
> > parameters. As I recall, this was done because it allowed a natural
> > partial decomposition between GPRs and stack for structures that
> > straddle the number of available parameter-passing GPRs. If we
> > could
> > accomplish that with regular byval parameters and regular direct
> > parameters, then we'd not need any of this array coercion, and the
> > system, including for the purposes of aliasing analysis, would work
> > as intended. There may be some infrastructure work required in the
> > backend (SelectionDAG builder, etc.) -- Uli, if you know please
> > comment -- but I think moving away from the array coercions might
> > be
> > the right solution, even if that requires some infrastructure
> enhancements.
> 
> I still think "byval" is the wrong approach here.  Using "byval",
> the LLVM target-independent codegen layers assume the ABI requires
> actually passing a *pointer* to the argument -- but that's not true
> on PowerPC64, no pointer is in fact being passed.
> 
> This means that the back-end, in those cases where we use byval,
> has to fake up an address when common code asks for the value of
> the argument.  This is not too bad if the argument was in fact
> actually passed (fully) on the stack, but if it wasn't, we have
> to write the argument (partially) to the stack, and possibly
> even allocate stack space, just to satisfy common code ...
> 
> Now, passing such arguments as "direct" might make more sense.  But
> with the current infrastructure, this doesn't really work either.
> For one, there is currently no place to attach alignment information
> to a "direct" argument, and the back-end is not able in all cases to
> reconstruct the required on-stack alignment from the LLVM type info.
> In addition, passing a struct type "direct" causes clang codegen
> common code in many cases to "flatten" the argument, i.e. pass the
> struct elements as separate arguments on the LLVM IR level.  This
> may in some cases be OK since it results in the same binary ABI;
> but in other cases it is definitively wrong.
> 
> Still, if we need to move away from coerced array types, fixing
> the infrastructure such that "direct" can be used would be my
> prefered solution.  If I understood the original email thread
> correctly, the current situation does not lead to incorrect code
> generation, just potentially suboptimal code generation?  In this
> case, I guess an incremental solution could be employed; we could
> start with avoiding the array coercion in those cases where using
> the struct type as direct type is already safe even with the
> current infrastructure, and then expand the set of types where
> this is true by successively improving the infrastructure.

That's correct, the code is correct, but potentially suboptimal. I agree with your assessment, and an incremental approach certainly makes sense. Regarding the general case, which requires infrastructure enhancement, we just need to decide what improvement is needed. As I mentioned in the e-mail to Reid, we could certainly attach an 'align' attribute to a 'direct' struct argument, and update the infrastructure so that it means the right thing.

I did a quick experiment, and changed the IR from the previous example so that we have:

%struct.box3 = type { double*, double*, <4 x float> }

and the function taking the parameter like this:

define void @test3(double* noalias nocapture %result, %struct.box3 %my_struct, i32 %len)

and then extracting the first member like this:

  %0 = extractvalue %struct.box3 %my_struct, 0

and I ran it though the standard pipeline at -O3, and nothing changed that (so we, at least, don't always break up these FCA arguments into separate parameters). What I mean above is that it might make sense to change things so that we're allowed to use this:

  define void @test3(double* noalias nocapture %result, %struct.box3 %my_struct align 16, i32 %len)

and then I imagine that the SDAGBuilder needs to keep track of which parameter values being lowered are part of a structure (in the same way as it keeps track of this for arrays currently), if it does not do this already.

Thanks again,
Hal

> 
> Bye,
> Ulrich
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory