[LLVMdev] Alias analysis issue with structs on PPC
Olivier Sallenave
ol.sall at gmail.com
Sun Mar 15 19:01:02 PDT 2015
Reason is that I want this case to be supported by default, without
specific flags (should we change the default AA for PPC if we do a separate
one?).
Fixing at the Clang level might be the right thing to do... But if there
are other cases like this, the traversal would definitely help in general
to get a proper handling of restrict.
Olivier
On Sunday, March 15, 2015, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>
> On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave <ol.sall at gmail.com
> <javascript:_e(%7B%7D,'cvml','ol.sall at gmail.com');>> wrote:
>
>> Hi Daniel,
>>
>> Thanks for your feedback. I would prefer not to write a new AA. Can't we
>> directly implement that traversal in BasicAA?
>>
> Can I ask why?
> Outside of the "well, it's another pass", i mean?
>
> BasicAA is stateless, so you can't cache, and you really don't want to
> redo these walks repeatedly (especially when the answer doesn't change
> unless AA is invalidated). It would be really expensive.
> Doing this in a separate analysis pass, caching the answer, and producing
> AA results, seems to me exactly the right thing.
>
>
>> Otherwise, I'll investigate why this i64 was generated in the first place
>> (but like you, I don't really want to know why :-)
>>
>
> Reid walked over to my desk and told me all the gory details - the clang
> ABI lowering of structs like these for anything but x86 basically says
> "oh, this really should be 2 8 byte GPR's, and the way it makes this
> happen is by using i64's :)
>
> If you want to do it at a clang level, the right thing to do is to fixup
> the ABI lowerings for pointers to keep them pointers in this case.
>
> For something simpler, if you wanted, you could do the *exact* opposite of
> happens now and you'd get better results.
>
> This is, pass everything that needs to be 2 8 byte regs as 8 byte
> pointers, and cast it back to something if it's not really a pointer, using
> ptrtoint.
>
> if it's not really a pointer, we don't care what AA says about it.
> If it is a pointer, now we don't get bad AA answers, because there is no
> inttoptr being used like there is now.
>
> Of course, i have no idea if this strategy is going to produce correct ABI
> results on all platforms, it depends on whether they treat pointers as
> special.
>
> Olivier
>>
>> 2015-03-13 18:51 GMT-04:00 Daniel Berlin <dberlin at dberlin.org
>> <javascript:_e(%7B%7D,'cvml','dberlin at dberlin.org');>>:
>>
>>>
>>>
>>> On Fri, Mar 13, 2015 at 2:54 PM Daniel Berlin <dberlin at dberlin.org
>>> <javascript:_e(%7B%7D,'cvml','dberlin at dberlin.org');>> wrote:
>>>
>>>> On Fri, Mar 13, 2015 at 2:39 PM Olivier H Sallenave <
>>>> ohsallen at us.ibm.com
>>>> <javascript:_e(%7B%7D,'cvml','ohsallen at us.ibm.com');>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have the following C loop to vectorize:
>>>>>
>>>>> struct box {
>>>>> double* source;
>>>>> };
>>>>>
>>>>> void test(double* restrict result, struct box my_struct, int len)
>>>>> {
>>>>> for (int i=0 ; i<len; i++) {
>>>>> result[i] = my_struct.source[i] * my_struct.source[i];
>>>>> }
>>>>> }
>>>>>
>>>>> There are two references in the loop, result[i] (restrict) and
>>>>> my_struct.source[i] (readonly). The compiler should easily figure out that
>>>>> they do not alias.
>>>>>
>>>>> Compiling for x86, the loop alias analysis works just fine:
>>>>> AST: Alias Set Tracker: 2 alias sets for 2 pointer values.
>>>>> AliasSet[0x7fd8e2f32290, 1] must alias, No access Pointers: (double*
>>>>> %arrayidx5, 18446744073709551615)
>>>>> AliasSet[0x7fd8e2f322e0, 1] must alias, No access Pointers: (double*
>>>>> %arrayidx, 18446744073709551615)
>>>>>
>>>>> Compiling for PPC with -target powerpc64le-ibm-linux-gnu, the two
>>>>> addresses now alias:
>>>>> AST: Alias Set Tracker: 1 alias sets for 2 pointer values.
>>>>> AliasSet[0x7f931bd5bdc0, 2] may alias, No access Pointers: (double*
>>>>> %arrayidx5, 18446744073709551615), (double* %arrayidx, 18446744073709551615)
>>>>>
>>>>> BasicAA is used for both targets by default. The difference is that in
>>>>> PPC, the IR obtained from Clang takes an i64 as parameter instead of a
>>>>> double* for my_struct.
>>>>>
>>>>
>>>> I don't even want to know why this would be the case :)
>>>>
>>>>
>>>>> This parameter is then coerced into double* using an inttoptr
>>>>> instruction. The code in BasicAliasAnalysis.cpp which is triggered for x86
>>>>> is the following:
>>>>>
>>>>> // Function arguments can't alias with things that are known to be
>>>>> // unambigously identified at the function level.
>>>>> if ((isa<Argument>(O1) && isIdentifiedFunctionLocal(O2)) ||
>>>>> (isa<Argument>(O2) && isIdentifiedFunctionLocal(O1)))
>>>>> return NoAlias;
>>>>>
>>>>> isIdentifiedFunctionLocal(V) returns true for a noalias argument (such
>>>>> as result), but the other address (my_struct) must be a function argument
>>>>> in order to return NoAlias, which is not the case anymore for PPC (since
>>>>> my_struct is now the result from an inttoptr instruction). If I understand,
>>>>> the problem is that we cannot trust the fact that locals do not alias with
>>>>> restrict parameters (because the compiler could generate some locals which
>>>>> alias)?
>>>>>
>>>> Yes, because pointers *based on* the noalias'd argument are legal
>>>> aliases.
>>>>
>>>> So if you don't know it's an argument or an identified local, it could
>>>> be based on the restricted pointer, and thus, alias it.
>>>>
>>>>
>>>> If someone has suggestions about this, that would help a lot.
>>>>>
>>>>
>>>> The only way you could prove something in this case would be to walk
>>>> the chain and prove the value comes directly from an argument with no
>>>> modification.
>>>>
>>>
>>> Actually, you could do the opposite, too, pretty cheaply.
>>>
>>> You could write a new pass or AA.
>>> It traverses chains in the reverse direction (IE it goes from the
>>> arguments, and walks down the immediate use chain, marking things as based
>>> on arguments or not), and makes a lookup table of things it can prove are
>>> also unmolested identified objects.
>>> (which would be the result of inttoptr in your case).
>>>
>>> You can then use this simple lookup table to answer the
>>> isIdentifiedObject question better.
>>> (You'd have to make isIdentifiedObject part of the AA interface, or take
>>> an optional table, blah blah blah)
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu
>>> <javascript:_e(%7B%7D,'cvml','LLVMdev at cs.uiuc.edu');>
>>> http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150315/001e2549/attachment.html>
More information about the llvm-dev
mailing list