[llvm-dev] FW: Restrict qualifier on class members
Jeroen Dobbelaere via llvm-dev
llvm-dev at lists.llvm.org
Wed Jun 24 01:32:09 PDT 2020
Hi Bandhav,
In the previous example, I did notice that the vectorizer used the noalias information, but that the information was also stripped during vectorization.
That is certainly one of the places where the noalias handling can be improved.
It would be interesting to see if the necessary information gets through to the MIR level when vectorization is disabled.
That should be visible with the 'ptr_provenance' field.
Greetings,
Jeroen Dobbelaere
From: Bandhav Veluri <bandhav.veluri00 at gmail.com>
Sent: Wednesday, June 24, 2020 10:12
To: Jeroen Dobbelaere <dobbel at synopsys.com>
Cc: Neil Henning <neil.henning at unity3d.com>; llvm-dev at lists.llvm.org; Kruse, Michael <michael.kruse at anl.gov>
Subject: Re: FW: [llvm-dev] Restrict qualifier on class members
Hi Jeroen,
Sorry, I missed that. I tried the patch with this program:
#include <stdint.h>
#define __remote __attribute__((address_space(1)))

__remote int* A;
__remote int* B;

void vec_add(__remote int* __restrict a,
             __remote int* __restrict b,
             int n) {
  #pragma unroll 4
  for(int i=0; i<n; ++i) {
    a[i] += b[i];
  }
}

int main(int argc, char** argv) {
  __remote int* __restrict a = A;
  __remote int* __restrict b = B;
  #pragma unroll 4
  for(int i=0; i<4; ++i) {
    a[i] += b[i];
  }
  return 0;
}
vec_add gives the following schedule:
*** Final schedule for %bb.8 ***
SU(0): %33:gpr = LW %56:gpr, -8 :: (load 4 from %ir.scevgep8, !tbaa !14, !noalias !13, addrspace 1)
SU(1): %34:gpr = LW %55:gpr, -8 :: (load 4 from %ir.scevgep14, !tbaa !14, !noalias !13, addrspace 1)
SU(4): %36:gpr = LW %56:gpr, -4 :: (load 4 from %ir.scevgep10, !tbaa !14, !noalias !13, addrspace 1)
SU(5): %37:gpr = LW %55:gpr, -4 :: (load 4 from %ir.scevgep16, !tbaa !14, !noalias !13, addrspace 1)
SU(8): %39:gpr = LW %56:gpr, 0 :: (load 4 from %ir.lsr.iv6, !tbaa !14, !noalias !13, addrspace 1)
SU(9): %40:gpr = LW %55:gpr, 0 :: (load 4 from %ir.lsr.iv12, !tbaa !14, !noalias !13, addrspace 1)
SU(12): %42:gpr = LW %56:gpr, 4 :: (load 4 from %ir.scevgep9, !tbaa !14, !noalias !13, addrspace 1)
SU(13): %43:gpr = LW %55:gpr, 4 :: (load 4 from %ir.scevgep15, !tbaa !14, !noalias !13, addrspace 1)
SU(2): %35:gpr = nsw ADD %34:gpr, %33:gpr
SU(3): SW %35:gpr, %55:gpr, -8 :: (store 4 into %ir.scevgep14, !tbaa !14, !noalias !13, addrspace 1)
SU(6): %38:gpr = nsw ADD %37:gpr, %36:gpr
SU(7): SW %38:gpr, %55:gpr, -4 :: (store 4 into %ir.scevgep16, !tbaa !14, !noalias !13, addrspace 1)
SU(10): %41:gpr = nsw ADD %40:gpr, %39:gpr
SU(11): SW %41:gpr, %55:gpr, 0 :: (store 4 into %ir.lsr.iv12, !tbaa !14, !noalias !13, addrspace 1)
SU(14): %44:gpr = nsw ADD %43:gpr, %42:gpr
SU(15): SW %44:gpr, %55:gpr, 4 :: (store 4 into %ir.scevgep15, !tbaa !14, !noalias !13, addrspace 1)
SU(16): %57:gpr = nuw nsw ADDI %57:gpr, 4
SU(17): %56:gpr = ADDI %56:gpr, 16
SU(18): %55:gpr = ADDI %55:gpr, 16
And main gives the following schedule:
*** Final schedule for %bb.0 ***
SU(0): %2:gpr = LUI target-flags(riscv-hi) @A
SU(2): %4:gpr = LUI target-flags(riscv-hi) @B
SU(3): %5:gpr = LW %4:gpr, target-flags(riscv-lo) @B :: (dereferenceable load 4 from @B, !tbaa !9, !noalias !22)
SU(1): %3:gpr = LW %2:gpr, target-flags(riscv-lo) @A :: (dereferenceable load 4 from @A, !tbaa !9, !noalias !22)
SU(4): %6:gpr = LW %5:gpr, 0 :: (load 4 from %ir.3, !tbaa !14, !noalias !22, addrspace 1)
SU(5): %7:gpr = LW %3:gpr, 0 :: (load 4 from %ir.1, !tbaa !14, !noalias !22, addrspace 1)
SU(6): %8:gpr = nsw ADD %7:gpr, %6:gpr
SU(7): SW %8:gpr, %3:gpr, 0 :: (store 4 into %ir.1, !tbaa !14, !noalias !22, addrspace 1)
SU(8): %9:gpr = LW %5:gpr, 4 :: (load 4 from %ir.arrayidx.1, !tbaa !14, !noalias !22, addrspace 1)
SU(9): %10:gpr = LW %3:gpr, 4 :: (load 4 from %ir.arrayidx1.1, !tbaa !14, !noalias !22, addrspace 1)
SU(10): %11:gpr = nsw ADD %10:gpr, %9:gpr
SU(11): SW %11:gpr, %3:gpr, 4 :: (store 4 into %ir.arrayidx1.1, !tbaa !14, !noalias !22, addrspace 1)
SU(12): %12:gpr = LW %5:gpr, 8 :: (load 4 from %ir.arrayidx.2, !tbaa !14, !noalias !22, addrspace 1)
SU(13): %13:gpr = LW %3:gpr, 8 :: (load 4 from %ir.arrayidx1.2, !tbaa !14, !noalias !22, addrspace 1)
SU(14): %14:gpr = nsw ADD %13:gpr, %12:gpr
SU(15): SW %14:gpr, %3:gpr, 8 :: (store 4 into %ir.arrayidx1.2, !tbaa !14, !noalias !22, addrspace 1)
SU(16): %15:gpr = LW %5:gpr, 12 :: (load 4 from %ir.arrayidx.3, !tbaa !14, !noalias !22, addrspace 1)
SU(17): %16:gpr = LW %3:gpr, 12 :: (load 4 from %ir.arrayidx1.3, !tbaa !14, !noalias !22, addrspace 1)
SU(18): %17:gpr = nsw ADD %16:gpr, %15:gpr
SU(20): $x10 = COPY $x0
SU(19): SW %17:gpr, %3:gpr, 12 :: (store 4 into %ir.arrayidx1.3, !tbaa !14, !noalias !22, addrspace 1)
This is great! Memory accesses are marked noalias. I wanted the memory accesses annotated as noalias mainly to remove loop-carried dependencies, so that I can reorder them for efficient scheduling. But when I look at the scheduling DAG, the two functions behave differently.
For vec_add, I see something like this (note BotQ.A: the scheduler can choose any of those nodes, so there is no loop-carried dependence):
- Latency limited.
** ScheduleDAGMILive::schedule picking next node
Queue BotQ.P:
Queue BotQ.A: 16 15 11 7 3
Cand SU(16) ORDER
Pick Bot ORDER
For main, at best I see something like this:
** ScheduleDAGMILive::schedule picking next node
Cycle: 45 BotQ.A
Queue BotQ.P:
Queue BotQ.A: 12 13
Cand SU(12) ORDER
Cand SU(13) ORDER
In theory, the schedules for vec_add and main should be the same, right? Is there anything else I should do to make __restrict remove the loop-carried dependence in main?
Attaching IR and scheduler log for reference...
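To make the reordering question concrete, here is a minimal sketch of why the scheduler needs the noalias guarantee before it can issue all loads ahead of the stores, as in the vec_add schedule above. The function names are hypothetical and chosen for illustration only. The pointers are deliberately not marked __restrict, so that the overlapping call is well defined.

```cpp
#include <cassert>

// In-order version: the store to a[i] completes before b[i + 1] is loaded.
// This is the only safe order when a and b might overlap.
void add4_in_order(int* a, const int* b) {
    assert(a != nullptr && b != nullptr);
    for (int i = 0; i < 4; ++i) {
        a[i] += b[i];
    }
}

// Loads-first version: every load is issued before any store, similar in
// spirit to the vec_add schedule. It matches add4_in_order only when a and b
// do not overlap, which is the promise __restrict makes.
void add4_loads_first(int* a, const int* b) {
    assert(a != nullptr && b != nullptr);
    int av[4];
    int bv[4];
    for (int i = 0; i < 4; ++i) {
        av[i] = a[i];
        bv[i] = b[i];
    }
    for (int i = 0; i < 4; ++i) {
        a[i] = av[i] + bv[i];
    }
}

// Hypothetical usage check (illustrative only).
void demo_reordering() {
    // Disjoint arrays: both orders give the same result.
    int a1[4] = {1, 2, 3, 4};
    int a2[4] = {1, 2, 3, 4};
    const int b[4] = {10, 20, 30, 40};
    add4_in_order(a1, b);
    add4_loads_first(a2, b);
    for (int i = 0; i < 4; ++i) {
        assert(a1[i] == a2[i]);
    }

    // Overlapping arrays (a == b + 1): the two orders diverge.
    int buf1[5] = {1, 1, 1, 1, 1};
    int buf2[5] = {1, 1, 1, 1, 1};
    add4_in_order(buf1 + 1, buf1);     // running sum: 1 2 3 4 5
    add4_loads_first(buf2 + 1, buf2);  // stale loads:  1 2 2 2 2
    assert(buf1[4] == 5);
    assert(buf2[4] == 2);
}
```

Without a no-overlap guarantee the scheduler has to keep each store ahead of the following loads, which looks like the interleaved load/add/store pattern in the main schedule.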
On Mon, Jun 22, 2020 at 3:03 PM Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com> wrote:
Hi Bandhav,
As mentioned in the summary of https://reviews.llvm.org/D69542:
The base version is b2a37cfe2bda0bc8c4d2e981922b5ac59c429bdc (June 12, 2020)
Greetings,
Jeroen Dobbelaere
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Bandhav Veluri via llvm-dev
Sent: Monday, June 22, 2020 18:32
To: Neil Henning <neil.henning at unity3d.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>; Kruse, Michael <michael.kruse at anl.gov>
Subject: Re: [llvm-dev] Restrict qualifier on class members
Hi Jeroen,
That's great! I was trying to use the patch; what's the latest version of the project it can be applied to?
Hi Neil,
That seems like what I can do as well! Do you happen to have some examples lying around? Maybe a pointer to the planned presentation, if that's okay?
Thank you,
Bandhav
On Mon, Jun 22, 2020 at 1:55 AM Neil Henning <neil.henning at unity3d.com> wrote:
I was originally going to cover this in my now defunct EuroLLVM talk but... we had this exact same problem on Unity's HPC# Burst compiler - how to track no-aliasing on structs. We were constrained in that we had to make it work with LLVM versions all the way back to shipped LLVM 6, so what we did was:
* Added module-level metadata that tracked whether a given struct member field was no-alias.
* Added our own alias analysis using createExternalAAWrapperPass to register it in the pass pipeline.
This allowed us to have zero modifications to LLVM and do something useful with aliasing. The one 'issue' with it is that if you have a stack-allocated struct that is SROA'ed, you will lose the info that it was a struct; or, if you are in a private/internal linkage function that has the struct as an argument, the opt passes can modify the function signature so that the struct is lost too. We had to do some mitigations here to get perfect aliasing on our use cases.
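As a rough illustration of the bookkeeping described above, here is a toy model in plain C++. It does not use any LLVM API and is not Burst's implementation: the class, the enum, and the "Particles" struct name are all hypothetical, and the rule that a no-alias field never overlaps memory reached through a different field is an assumption made for this sketch. In LLVM itself, the query would live in a custom alias analysis registered through createExternalAAWrapperPass, as described in the message.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the module-level metadata: the set of
// (struct name, field index) pairs that were declared no-alias.
using FieldId = std::pair<std::string, unsigned>;

enum class AliasAnswer { NoAlias, MayAlias };

class FieldAliasTable {
public:
    void markNoAlias(const std::string& structName, unsigned fieldIndex) {
        noAliasFields_.insert(FieldId{structName, fieldIndex});
    }

    // ASSUMPTION for this sketch: accesses reached through two different
    // fields do not alias if at least one of the fields is marked no-alias.
    // Anything else falls back to the conservative MayAlias answer.
    AliasAnswer query(const FieldId& a, const FieldId& b) const {
        if (a == b) {
            return AliasAnswer::MayAlias;
        }
        const bool eitherMarked =
            noAliasFields_.count(a) != 0 || noAliasFields_.count(b) != 0;
        return eitherMarked ? AliasAnswer::NoAlias : AliasAnswer::MayAlias;
    }

private:
    std::set<FieldId> noAliasFields_;
};

// Hypothetical usage check (illustrative only).
void demo_field_alias_table() {
    FieldAliasTable table;
    const FieldId positions{"Particles", 0u};
    const FieldId velocities{"Particles", 1u};
    const FieldId masses{"Particles", 2u};
    table.markNoAlias("Particles", 0u);

    assert(table.query(positions, velocities) == AliasAnswer::NoAlias);
    assert(table.query(velocities, masses) == AliasAnswer::MayAlias);
    assert(table.query(positions, positions) == AliasAnswer::MayAlias);
}
```

Because the lookup is keyed on the struct identity, any transformation that removes the struct (such as SROA or a rewritten function signature) leaves only the conservative MayAlias fallback, which matches the caveat in the message.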
Hope this helps,
-Neil.
On Mon, Jun 22, 2020 at 5:44 AM Michael Kruse via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Unfortunately https://llvm.org/docs/LangRef.html#llvm-loop-parallel-accesses-metadata
is not a solution here. A loop-parallel access does not imply
non-aliasing. The obvious case is when only reading from a location,
but even when a location is written to I'd be careful to deduce that
they do not alias since it might be a "benign data race" or the value
never used. Additionally, LLVM's loop unroller is known to not handle
noalias metadata correctly as it just copies it.
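A small sketch of the distinction, with hypothetical function names: the loop below has no loop-carried dependence, so its iterations can run in any order, yet the two pointers may refer to exactly the same array. Parallel iterations therefore say nothing about whether dst and src alias.

```cpp
#include <cassert>
#include <cstddef>

// Each iteration reads src[i] and writes dst[i] only, so no iteration
// depends on another one, whether or not dst and src point to the same array.
void scale_forward(int* dst, const int* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = 2 * src[i];
    }
}

// Same loop body with the iterations executed in reverse order.
void scale_backward(int* dst, const int* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = n; i-- > 0;) {
        dst[i] = 2 * src[i];
    }
}

// Hypothetical usage check (illustrative only).
void demo_parallel_but_aliasing() {
    int v[4] = {1, 2, 3, 4};
    int w[4] = {1, 2, 3, 4};
    scale_forward(v, v, 4);   // dst and src alias completely
    scale_backward(w, w, 4);  // different iteration order, same result
    for (std::size_t i = 0; i < 4; ++i) {
        assert(v[i] == w[i]);
    }
    assert(v[0] == 2 && v[3] == 8);
}
```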
There has been a discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2020-May/141587.html
Michael
On Sun, Jun 21, 2020 at 12:24 PM Johannes Doerfert via
llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Bandhav,
>
>
> Jeroen Dobbelaere (CC'ed) is currently working on support for restrict qualified local variables and struct members.
>
> The patches exist but are not merged yet. If you want to give it a try, apply https://reviews.llvm.org/D69542.
>
>
> Initially I could only think of this solution for your problem: https://godbolt.org/z/6WtPXJ
>
> Michael (CC'ed) might know another annotation to get `llvm.access` metadata for the loop, which should do what you intend.
>
>
> Cheers,
>
> Johannes
>
>
> On 6/21/20 11:56 AM, Bandhav Veluri via llvm-dev wrote:
>
> Hi,
>
> I'm trying to abstract some special pointers with a class, like in the
> example program below:
>
> 1 #define __remote __attribute__((address_space(1)))
> 2 #include <stdint.h>
> 3
> 4 __remote int* A;
> 5 __remote int* B;
> 6
> 7 class RemotePtr {
> 8 private:
> 9 __remote int* __restrict a;
> 10
> 11 public:
> 12 RemotePtr(__remote int* a) : a(a) {}
> 13
> 14 __remote int& at(int n) {
> 15 return a[n];
> 16 }
> 17 };
> 18
> 19 int main(int argc, char** argv) {
> 20 RemotePtr a(A);
> 21 RemotePtr b(B);
> 22
> 23 #pragma unroll 4
> 24 for(int i=0; i<4; ++i) {
> 25 a.at(i) += b.at(i);
> 26 }
> 27
> 28 return 0;
> 29 }
>
> It's given that pointer a, in each object of the class RemotePtr, is the
> only pointer that can access the array pointed by it. So, I tried __remote
> int* __restrict a; (line 9) construct to tell Clang the same. This doesn't
> seem to work and I see no noalias in the generated IR. Specifically, I want
> lines 23-26 optimized assuming no aliasing between A and B. Any reason why
> Clang shouldn't annotate memory accesses in lines 23-26 with noalias
> taking line 9 into account?
>
> The higher level problem is this: is there a way to compile lines 23-26
> assuming no aliasing between A and B, by just doing something in the
> RemotePtr class (so that main is clear of ugly code)? If that's not
> possible, is there a way to tell Clang that lines 23-26 should assume no
> aliasing at all, by some pragma?
>
> Thank you,
> Bandhav
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
--
Neil Henning
Senior Software Engineer Compiler
unity.com