[llvm-dev] FW: Restrict qualifier on class members
Bandhav Veluri via llvm-dev
llvm-dev at lists.llvm.org
Wed Jun 24 01:11:41 PDT 2020
Hi Jeroen,
Sorry, I missed that. I tried the patch, and this program:
#include <stdint.h>
#define __remote __attribute__((address_space(1)))

__remote int* A;
__remote int* B;

void vec_add(__remote int* __restrict a,
             __remote int* __restrict b,
             int n) {
  #pragma unroll 4
  for(int i=0; i<n; ++i) {
    a[i] += b[i];
  }
}

int main(int argc, char** argv) {
  __remote int* __restrict a = A;
  __remote int* __restrict b = B;
  #pragma unroll 4
  for(int i=0; i<4; ++i) {
    a[i] += b[i];
  }
  return 0;
}
vec_add gives the following schedule:
*** Final schedule for %bb.8 ***
SU(0): %33:gpr = LW %56:gpr, -8 :: (load 4 from %ir.scevgep8, !tbaa !14,
!noalias !13, addrspace 1)
SU(1): %34:gpr = LW %55:gpr, -8 :: (load 4 from %ir.scevgep14, !tbaa !14,
!noalias !13, addrspace 1)
SU(4): %36:gpr = LW %56:gpr, -4 :: (load 4 from %ir.scevgep10, !tbaa !14,
!noalias !13, addrspace 1)
SU(5): %37:gpr = LW %55:gpr, -4 :: (load 4 from %ir.scevgep16, !tbaa !14,
!noalias !13, addrspace 1)
SU(8): %39:gpr = LW %56:gpr, 0 :: (load 4 from %ir.lsr.iv6, !tbaa !14,
!noalias !13, addrspace 1)
SU(9): %40:gpr = LW %55:gpr, 0 :: (load 4 from %ir.lsr.iv12, !tbaa !14,
!noalias !13, addrspace 1)
SU(12): %42:gpr = LW %56:gpr, 4 :: (load 4 from %ir.scevgep9, !tbaa !14,
!noalias !13, addrspace 1)
SU(13): %43:gpr = LW %55:gpr, 4 :: (load 4 from %ir.scevgep15, !tbaa !14,
!noalias !13, addrspace 1)
SU(2): %35:gpr = nsw ADD %34:gpr, %33:gpr
SU(3): SW %35:gpr, %55:gpr, -8 :: (store 4 into %ir.scevgep14, !tbaa !14,
!noalias !13, addrspace 1)
SU(6): %38:gpr = nsw ADD %37:gpr, %36:gpr
SU(7): SW %38:gpr, %55:gpr, -4 :: (store 4 into %ir.scevgep16, !tbaa !14,
!noalias !13, addrspace 1)
SU(10): %41:gpr = nsw ADD %40:gpr, %39:gpr
SU(11): SW %41:gpr, %55:gpr, 0 :: (store 4 into %ir.lsr.iv12, !tbaa !14,
!noalias !13, addrspace 1)
SU(14): %44:gpr = nsw ADD %43:gpr, %42:gpr
SU(15): SW %44:gpr, %55:gpr, 4 :: (store 4 into %ir.scevgep15, !tbaa !14,
!noalias !13, addrspace 1)
SU(16): %57:gpr = nuw nsw ADDI %57:gpr, 4
SU(17): %56:gpr = ADDI %56:gpr, 16
SU(18): %55:gpr = ADDI %55:gpr, 16
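Note that this schedule issues all eight loads before the first store. That ordering only matches the source loop if no store to a[] can feed a later load, which is what the noalias annotations assert. A minimal sketch of the difference in plain C++ (the names add_in_order and add4_loads_first are made up for illustration, and the address space attribute is dropped so it builds anywhere):

```cpp
#include <cstddef>

// Per-element order: load a[i], load b[i], add, store a[i], then move on.
// Each store happens before the loads of the next element.
void add_in_order(int* a, const int* b, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    a[i] += b[i];
  }
}

// Loads-first order for one block of 4: all eight loads, then the adds and
// stores. This is only equivalent to add_in_order when no store to a[]
// can change a value that a later load would have read.
void add4_loads_first(int* a, const int* b) {
  int la[4];
  int lb[4];
  for (int i = 0; i < 4; ++i) {
    la[i] = a[i];
    lb[i] = b[i];
  }
  for (int i = 0; i < 4; ++i) {
    a[i] = la[i] + lb[i];
  }
}
```

With disjoint arrays both orders give the same result. With overlapping ones, e.g. a = buf + 1 and b = buf on buf = {1, 2, 3, 4, 5}, add_in_order leaves {1, 3, 6, 10, 15} while add4_loads_first leaves {1, 3, 5, 7, 9}.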
And main gives the following schedule:
*** Final schedule for %bb.0 ***
SU(0): %2:gpr = LUI target-flags(riscv-hi) @A
SU(2): %4:gpr = LUI target-flags(riscv-hi) @B
SU(3): %5:gpr = LW %4:gpr, target-flags(riscv-lo) @B :: (dereferenceable
load 4 from @B, !tbaa !9, !noalias !22)
SU(1): %3:gpr = LW %2:gpr, target-flags(riscv-lo) @A :: (dereferenceable
load 4 from @A, !tbaa !9, !noalias !22)
SU(4): %6:gpr = LW %5:gpr, 0 :: (load 4 from %ir.3, !tbaa !14, !noalias
!22, addrspace 1)
SU(5): %7:gpr = LW %3:gpr, 0 :: (load 4 from %ir.1, !tbaa !14, !noalias
!22, addrspace 1)
SU(6): %8:gpr = nsw ADD %7:gpr, %6:gpr
SU(7): SW %8:gpr, %3:gpr, 0 :: (store 4 into %ir.1, !tbaa !14, !noalias
!22, addrspace 1)
SU(8): %9:gpr = LW %5:gpr, 4 :: (load 4 from %ir.arrayidx.1, !tbaa !14,
!noalias !22, addrspace 1)
SU(9): %10:gpr = LW %3:gpr, 4 :: (load 4 from %ir.arrayidx1.1, !tbaa
!14, !noalias !22, addrspace 1)
SU(10): %11:gpr = nsw ADD %10:gpr, %9:gpr
SU(11): SW %11:gpr, %3:gpr, 4 :: (store 4 into %ir.arrayidx1.1, !tbaa
!14, !noalias !22, addrspace 1)
SU(12): %12:gpr = LW %5:gpr, 8 :: (load 4 from %ir.arrayidx.2, !tbaa
!14, !noalias !22, addrspace 1)
SU(13): %13:gpr = LW %3:gpr, 8 :: (load 4 from %ir.arrayidx1.2, !tbaa
!14, !noalias !22, addrspace 1)
SU(14): %14:gpr = nsw ADD %13:gpr, %12:gpr
SU(15): SW %14:gpr, %3:gpr, 8 :: (store 4 into %ir.arrayidx1.2, !tbaa
!14, !noalias !22, addrspace 1)
SU(16): %15:gpr = LW %5:gpr, 12 :: (load 4 from %ir.arrayidx.3, !tbaa
!14, !noalias !22, addrspace 1)
SU(17): %16:gpr = LW %3:gpr, 12 :: (load 4 from %ir.arrayidx1.3, !tbaa
!14, !noalias !22, addrspace 1)
SU(18): %17:gpr = nsw ADD %16:gpr, %15:gpr
SU(20): $x10 = COPY $x0
SU(19): SW %17:gpr, %3:gpr, 12 :: (store 4 into %ir.arrayidx1.3, !tbaa
!14, !noalias !22, addrspace 1)
This is great! Memory accesses are marked noalias. I wanted the memory
accesses annotated as noalias to remove loop-carried dependencies, so that
I can reorder them for efficient scheduling. But when I look at the
schedule DAG, for vec_add I see something like this (note BotQ.A: the
scheduler can choose any of those, so there is no loop-carried dependence):
- Latency limited.
** ScheduleDAGMILive::schedule picking next node
Queue BotQ.P:
Queue BotQ.A: 16 15 11 7 3
Cand SU(16) ORDER
Pick Bot ORDER
For main, at best I see something like this:
** ScheduleDAGMILive::schedule picking next node
Cycle: 45 BotQ.A
Queue BotQ.P:
Queue BotQ.A: 12 13
Cand SU(12) ORDER
Cand SU(13) ORDER
In theory, the schedules for vec_add and main should be the same, right? Is
there anything else I should do to make __restrict remove the loop-carried
dependence in main?
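One experiment that might help narrow this down: move main's loop into a small helper with __restrict parameters, the same shape as vec_add. This is only a sketch. add4 and run_from_globals are hypothetical names, the address space attribute is left out so it builds with any compiler that accepts __restrict, and whether the resulting schedule matches vec_add is untested.

```cpp
// Hypothetical helper, not part of the program above: same shape as vec_add
// with a fixed trip count of 4. Callers must pass non-overlapping arrays,
// since that is what __restrict promises to the compiler.
void add4(int* __restrict a, const int* __restrict b) {
#if defined(__clang__)
#pragma unroll 4
#endif
  for (int i = 0; i < 4; ++i) {
    a[i] += b[i];
  }
}

// Stand-ins for the globals A and B (plain pointers in this sketch).
int* A = nullptr;
int* B = nullptr;

// What main's body would reduce to: the loop sits behind restrict
// parameters instead of restrict-qualified local variables.
void run_from_globals() {
  add4(A, B);
}
```

The design choice here is only to make main's accesses go through the same kind of restrict-qualified arguments as vec_add, so the two schedules can be compared like for like.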
Attaching IR and scheduler log for reference...
On Mon, Jun 22, 2020 at 3:03 PM Jeroen Dobbelaere <
Jeroen.Dobbelaere at synopsys.com> wrote:
> Hi Bandhav,
>
>
>
> as mentioned in the summary of https://reviews.llvm.org/D69542 :
>
>
>
> The base version is b2a37cfe2bda0bc8c4d2e981922b5ac59c429bdc
> <https://reviews.llvm.org/rGb2a37cfe2bda0bc8c4d2e981922b5ac59c429bdc>
> (June 12, 2020)
>
>
>
> Greetings,
>
>
>
> Jeroen Dobbelaere
>
>
>
>
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Bandhav
> Veluri via llvm-dev
> *Sent:* Monday, June 22, 2020 18:32
> *To:* Neil Henning <neil.henning at unity3d.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>; Kruse, Michael <
> michael.kruse at anl.gov>
> *Subject:* Re: [llvm-dev] Restrict qualifier on class members
>
>
>
> Hi Jeroen,
>
>
>
> That's great! I was trying to use the patch. What's the latest version of
> the project it can be applied to?
>
>
>
> Hi Neil,
>
>
>
> That seems like what I can do as well! Do you happen to have some examples
> lying around? Maybe a pointer to the planned presentation, if that's okay?
>
>
>
> Thank you,
>
> Bandhav
>
>
>
> On Mon, Jun 22, 2020 at 1:55 AM Neil Henning <neil.henning at unity3d.com>
> wrote:
>
> I was originally going to cover this in my now defunct EuroLLVM talk
> but... we had this exact same problem on Unity's HPC# Burst compiler - how
> to track no-aliasing on structs. We were constrained in that we had to make
> it work with LLVM versions all the way back to shipped LLVM 6, so what we
> did was:
>
> - Added module-level metadata that tracked whether a given struct member
> field was no-alias.
> - Added our own alias analysis using createExternalAAWrapperPass to
> register it in the pass pipeline.
>
> This allowed us to have zero modifications to LLVM and do something useful
> with aliasing. The one 'issue' with it is if you have a stack-allocated
> struct that is SROA'ed you will lose the info that it was a struct, or if
> you are in a private/internal linkage function that has the struct as an
> argument, the opt passes can modify the function signature to lose the
> struct too. We had to do some mitigations here to get perfect aliasing on
> our usecases.
>
>
>
> Hope this helps,
>
> -Neil.
>
>
>
> On Mon, Jun 22, 2020 at 5:44 AM Michael Kruse via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Unfortunately
> https://llvm.org/docs/LangRef.html#llvm-loop-parallel-accesses-metadata
> is not a solution here. A loop-parallel access does not imply
> non-aliasing. The obvious case is when only reading from a location,
> but even when a location is written to I'd be careful to deduce that
> they do not alias since it might be a "benign data race" or the value
> never used. Additionally, LLVM's loop unroller is known to not handle
> noalias metadata correctly as it just copies it.
>
> There has been a discussion here:
> http://lists.llvm.org/pipermail/llvm-dev/2020-May/141587.html
>
> Michael
>
>
> Am So., 21. Juni 2020 um 12:24 Uhr schrieb Johannes Doerfert via
> llvm-dev <llvm-dev at lists.llvm.org>:
> >
> > Hi Bandhav,
> >
> >
> > Jeroen Dobbelaere (CC'ed) is currently working on support for
> > restrict-qualified local variables and struct members.
> >
> > The patches exist but are not merged yet. If you want to give it a try,
> > apply https://reviews.llvm.org/D69542.
> >
> >
> > Initially I could only think of this solution for your problem:
> > https://godbolt.org/z/6WtPXJ
> >
> > Michael (CC'ed) might know another annotation to get `llvm.access`
> > metadata for the loop, which should do what you intend.
> >
> >
> > Cheers,
> >
> > Johannes
> >
> >
> > On 6/21/20 11:56 AM, Bandhav Veluri via llvm-dev wrote:
> >
> > Hi,
> >
> > I'm trying to abstract some special pointers with a class, like in the
> > example program below:
> >
> >  1 #define __remote __attribute__((address_space(1)))
> >  2 #include <stdint.h>
> >  3
> >  4 __remote int* A;
> >  5 __remote int* B;
> >  6
> >  7 class RemotePtr {
> >  8 private:
> >  9   __remote int* __restrict a;
> > 10
> > 11 public:
> > 12   RemotePtr(__remote int* a) : a(a) {}
> > 13
> > 14   __remote int& at(int n) {
> > 15     return a[n];
> > 16   }
> > 17 };
> > 18
> > 19 int main(int argc, char** argv) {
> > 20   RemotePtr a(A);
> > 21   RemotePtr b(B);
> > 22
> > 23   #pragma unroll 4
> > 24   for(int i=0; i<4; ++i) {
> > 25     a.at(i) += b.at(i);
> > 26   }
> > 27
> > 28   return 0;
> > 29 }
> >
> > It's given that pointer a, in each object of the class RemotePtr, is the
> > only pointer that can access the array it points to. So I tried the
> > __remote int* __restrict a; construct (line 9) to tell Clang the same.
> > This doesn't seem to work, and I see no noalias in the generated IR.
> > Specifically, I want lines 23-26 optimized assuming no aliasing between
> > A and B. Is there any reason why Clang shouldn't annotate the memory
> > accesses in lines 23-26 with noalias, taking line 9 into account?
> >
> > The higher level problem is this: is there a way to compile lines 23-26
> > assuming no aliasing between A and B, by just doing something in the
> > RemotePtr class (so that main is clear of ugly code)? If that's not
> > possible, is there a way to tell Clang that lines 23-26 should assume no
> > aliasing at all, by some pragma?
> >
> > Thank you,
> > Bandhav
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
>
> *Neil Henning*
>
> Senior Software Engineer Compiler
>
> unity.com
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tmp.log
Type: text/x-log
Size: 79928 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200624/fbaeecf9/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tmp.ll
Type: application/octet-stream
Size: 11000 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200624/fbaeecf9/attachment-0001.obj>