[llvm-dev] llvm-ir: TBAA and struct copies
Jeroen Dobbelaere via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 6 07:50:09 PDT 2019
Thanks !
I'l try to come up with a patch to achieve this.
Greetings,
Jeroen Dobbelaere
> -----Original Message-----
> From: Ivan Kosarev <ivan at kosarev.info>
> Sent: Wednesday, June 5, 2019 15:39
> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm-
> dev at lists.llvm.org
> Cc: Ivan Kosarev <ikosarev at accesssoftek.com>
> Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies
>
> Hi Jeroen,
>
> > With the current understanding that I have about the problem and
> about the TBAA implementation, I am not sure that it is easy to fix with
> the default struct path tbaa.
>
> To handle aggregate accesses correctly, we need to know the size of the
> access in addition to the offset. Since the current format doesn't store
> sizes, it doesn't seem possible to do much more than we already do about
> aggregate accesses. This is further complicated by the fact that calls
> like memcpy() have their own format of metadata nodes.
>
> > For the 'NewFormat', The 'MayAlias' should be based on overlap of
> {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(),
> SubobjectTag.getSize()}
> > I am not sure if the same holds for the 'GenericTag'.
>
> Correct. As to the generic tag, this needs some very careful analysis
> and extensive testing, but generally I would expect it to be built so that:
> - its base type is the matching type in the considered access chains,
> - its access type is the generic type of the two access types being
> matched and
> - its range (offset + size) is the union of the matched access ranges.
>
> Regards,
>
>
> On 05/06/2019 15:45, Jeroen Dobbelaere wrote:
> > Hi Ivan,
> >
> > The code that we have is indeed different from what the 'standard llvm'
> expects. Let me explain:
> >
> > in our version we came into this situation in two steps:
> > 1) I added support for 'special types' that map directly to types supported
> by hardware.
> > These types are represented by a struct containing a single iXXX member,
> providing the necessary bits
> > of the type, and at the same time avoiding any change in the bits (if it
> would have been represented
> > by a pure iXXX). They need to be handled similarly as a fundamental type,
> so we provided support for that,
> > resulting in clang generating load/store instructions that work on a 'struct'.
> >
> > In order to get this working (for tbaa), I had to relax the check for access
> type, as it now could see an aggregate type
> > (that behaved as a fundamental type).
> >
> > 2) later on, we noticed that sroa (if I remember correctly) produced better
> code if it could work on load/store instructions of aggregates instead of
> > llvm.memcpy. I adapted clang to also emit load/store pairs for small structs.
> Just like with the special 'struct type', we the load/store's are decorated with
> > the struct and the size of the struct as access TBAAAccessInfo information.
> >
> > At that time all looked fine, but recently, csmith testing has been showing
> cases where alias analysis is now too aggressive.
> >
> > An easy way out is of course to fall back to llvm.memcpy, but that blocks a
> lot of possible optimizations, as it loses all the TBAA information.
> >
> > (In the original example: because of the llvm.memcpy, it does no matter if
> the *p=y copy is for a 'S' or for a different 'struct T'. In both cases, it blocks
> the optimization of the 'x.f' )
> >
> > With the current understanding that I have about the problem and about
> the TBAA implementation, I am not sure that it is easy to fix with the default
> struct path tbaa.
> > With the 'new struct path tbaa', as it also tracks the access size, I do see a
> possible solution:
> >
> > in TypeBasedAliasAnalysis.cpp, in function bool
> mayBeAccessToSubobjectOf(...):
> >
> > https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__github.com_llvm_llvm-
> 2Dproject_blob_master_llvm_lib_Analysis_TypeBasedAliasAnalysis.cpp-
> 23L619&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> OSzxlGOXXSfAvOLT6E8iPwwJk&m=wjZvxSyGva9_KdUwg3GJlKe3d-
> Xupw002Pbq1baYxCE&s=KoEC1PwWo0_GryWTGjQ5nhZdePTLJ8nlEcOOs0C_
> Ov8&e=
> >
> > if (BaseType.getNode() == SubobjectTag.getBaseType()) {
> > bool SameMemberAccess = OffsetInBase ==
> SubobjectTag.getOffset(); // **** HERE
> > if (GenericTag) {
> > *GenericTag = SameMemberAccess ? SubobjectTag.getNode() :
> > createAccessTag(CommonType);
> > }
> > MayAlias = SameMemberAccess;
> > return true;
> > }
> >
> > For the 'NewFormat', The 'MayAlias' should be based on overlap of
> {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(),
> SubobjectTag.getSize()}
> > I am not sure if the same holds for the 'GenericTag'.
> >
> > Do you think that would cause other problems ?
> >
> > Thanks,
> >
> > Jeroen Dobbelaere
> >
> >> -----Original Message-----
> >> From: Ivan Kosarev <ivan at kosarev.info>
> >> Sent: Wednesday, June 5, 2019 12:25
> >> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm-
> >> dev at lists.llvm.org
> >> Cc: Ivan Kosarev <ikosarev at accesssoftek.com>
> >> Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies
> >>
> >> Hello Jeroen,
> >>
> >> AFAIR, with the current TBAA format the access type shall never be an
> >> aggregate, so the !10 node looks suspicious.
> >>
> >> Tried to reproduce this on 362464 with that same input snippet, but got
> >> something completely different, see below.
> >>
> >> Regards,
> >>
> >> ---
> >> $ clang -cc1 -S -emit-llvm -O1 -disable-llvm-passes x.cpp
> >> $ cat x.ll
> >> ...
> >> define i32 @_Z6foobarv() #0 {
> >> entry:
> >> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S* @x,
> >> i32 0, i32 2), align 2, !tbaa !2
> >> %0 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
> >> %1 = bitcast %struct.S* %0 to i8*
> >> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4
> >> bitcast (%struct.S* @y to i8*), i64 8, i1 false), !tbaa.struct !10
> >> %2 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S* @x,
> >> i32 0, i32 2), align 2, !tbaa !2
> >> %conv = sext i16 %2 to i32
> >> ret i32 %conv
> >> }
> >> ...
> >> !2 = !{!3, !7, i64 6}
> >> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
> >> !4 = !{!"int", !5, i64 0}
> >> !5 = !{!"omnipotent char", !6, i64 0}
> >> !6 = !{!"Simple C++ TBAA"}
> >> !7 = !{!"short", !5, i64 0}
> >> !8 = !{!9, !9, i64 0}
> >> !9 = !{!"any pointer", !5, i64 0}
> >> !10 = !{i64 0, i64 4, !11, i64 4, i64 2, !12, i64 6, i64 2, !12}
> >> !11 = !{!4, !4, i64 0}
> >> !12 = !{!7, !7, i64 0}
> >> ---
> >>
> >>
> >>
> >> On 04/06/2019 17:34, Jeroen Dobbelaere via llvm-dev wrote:
> >>> Hi,
> >>>
> >>> I have a question about the current definition of TBAA (See [1]).
> >>>
> >>> In the LLVM-IR code that we produce, we generate load/stores of struct
> >> types. (See [2] and [3] for a godbolt example showing the issue)
> >>> For following c-alike code:
> >>>
> >>> struct S { int dummy; short e, f; } x,y;
> >>> struct S* p = &x;
> >>> int foobar() {
> >>> x.f=42;
> >>> *p=y; //**** struct copy
> >>> return x.f;
> >>> }
> >>>
> >>> We produce:
> >>>
> >>> [...]
> >>> %struct.S = type { i32, i16, i16 }
> >>> ; Function Attrs: norecurse nounwind uwtable
> >>> define dso_local i32 @foobar() local_unnamed_addr #0 {
> >>> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S* @x,
> i64
> >> 0, i32 2), align 2, !tbaa !2
> >>> %1 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
> >>> %2 = load %struct.S, %struct.S* @y, align 8, !tbaa !10
> ;*************
> >>> store %struct.S %2, %struct.S* %1, align 4, !tbaa !10 ;
> *************
> >>> %3 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S*
> @x,
> >> i64 0, i32 2), align 2, !tbaa !2
> >>> %4 = sext i16 %3 to i32
> >>> ret i32 %4
> >>> }
> >>> [...]
> >>> !0 = !{i32 1, !"wchar_size", i32 4}
> >>> !1 = !{!"clang version 9.0.0 (trunk 362464)"}
> >>> !2 = !{!3, !7, i64 6}
> >>> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
> >>> !4 = !{!"int", !5, i64 0}
> >>> !5 = !{!"omnipotent char", !6, i64 0}
> >>> !6 = !{!"Simple C++ TBAA"}
> >>> !7 = !{!"short", !5, i64 0}
> >>> !8 = !{!9, !9, i64 0}
> >>> !9 = !{!"any pointer", !5, i64 0}
> >>> !10 = !{!3, !3, i64 0} ; ***********************
> >>>
> >>>
> >>> In order to allow other optimizations, we have also attached TBAA
> >> information to the load/store of the struct (!tbaa !10).
> >>> The logical solution would be to construct '!tbaa !10' as shown:
> >>> - the base type is '!3 (_ZTS1S)'
> >>> - the access type is also '!3 (_ZTS1S)'
> >>> - the offset is 'i64 0'
> >>>
> >>> I observe following issues:
> >>> - issue 1: the tbaa semantics will not detect aliasing between '!10' and
> '!2'
> >> (See [2] and [3]; the load i16 in bar_wrong should not be optimized away)
> >>> - issue 2: according to the pure definition of the 'access tags', the base
> >> type and the access type can not be the same struct type.
> >>> As such, the provided example could be found to be 'invalid'. Still, by
> >> adding an extra indirection, a similar 'valid' example (with wrong behavior)
> >>> can be constructed. See [3]
> >>>
> >>> - For 'issue 2', I think that the definition of 'access tag' should be
> relaxed,
> >> allowing the description of a full copy' of a struct.
> >>> - For 'issue 1', the lookup algorithm should be enhanced so that aliasing
> >> can be detected for these cases.
> >>> Is this a correct interpretation ? Any input is welcome !
> >>>
> >>> Thanks,
> >>>
> >>> Jeroen Dobbelaere
> >>> ----
> >>> [1] tbaa metadata: https://urldefense.proofpoint.com/v2/url?u=https-
> >> 3A__llvm.org_docs_LangRef.html-23tbaa-
> >>
> 2Dmetadata&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6
> >> UnFk-OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=TmYSKjBxhgw5sSVeq-
> >> v7p1UmvzV_81OIMz-_pFakoZQ&e=
> >>> [2] godbolt example with single struct:
> >> https://urldefense.proofpoint.com/v2/url?u=https-
> >> 3A__www.godbolt.org_z_XcQy-
> >> 2DU&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=4OL_Z9OECS7bHERaUU-
> >> WMxWhDPt-f_17zpehoMGVgj4&e=
> >>> [3] godbolt example with nested struct:
> >> https://urldefense.proofpoint.com/v2/url?u=https-
> >> 3A__www.godbolt.org_z_mNd-
> >> 2DiI&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> >>
> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=0y6mskMKXX94CzNTFu1wVHF5m3Oc
> >> RPyoZBc-ShGZrqM&e=
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__lists.llvm.org_cgi-
> >> 2Dbin_mailman_listinfo_llvm-
> >>
> 2Ddev&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=410QYrHOxKaioN8wS4O3Sgfu9vh-
> >> 2Jnl5Z-xDqr94r0&e=
More information about the llvm-dev
mailing list