[PATCH] D93229: [VectorCombine] allow peeking through GEPs when creating a vector load
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 14 11:12:48 PST 2020
lebedev.ri added a comment.
In D93229#2452751 <https://reviews.llvm.org/D93229#2452751>, @spatel wrote:
> In D93229#2452695 <https://reviews.llvm.org/D93229#2452695>, @lebedev.ri wrote:
>
>> I'm having trouble coming up with an example because there appears to be a preexisting soundness problem. Example: (CC @nlopes @aqjune)
>>
>> define <8 x i16> @t(i8* align 128 dereferenceable(128) %base) {
>> %ptr = getelementptr inbounds i8, i8* %base, i64 1
>> %p = bitcast i8* %ptr to <8 x i16>*
>>
>> %gep = getelementptr inbounds <8 x i16>, <8 x i16>* %p, i64 0, i64 1
>> %s = load i16, i16* %gep, align 1
>> %r = insertelement <8 x i16> undef, i16 %s, i64 0
>> ret <8 x i16> %r
>> }
>>
>> /builddirs/llvm-project/build-Clang11-unknown$ /builddirs/llvm-project/build-Clang11-unknown/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -tv -vector-combine -mtriple=x86_64-- -mattr=avx2 -tv -o /dev/null --tv-smt-to=60000 /tmp/D93229.ll
>>
>> ----------------------------------------
>> define <8 x i16> @t(* dereferenceable(128) align(128) %base) {
>> %0:
>> %ptr = gep inbounds * dereferenceable(128) align(128) %base, 1 x i64 1
>> %p = bitcast * %ptr to *
>> %gep = gep inbounds * %p, 16 x i64 0, 2 x i64 1
>> %s = load i16, * %gep, align 1
>> %r = insertelement <8 x i16> undef, i16 %s, i64 0
>> ret <8 x i16> %r
>> }
>> =>
>> define <8 x i16> @t(* dereferenceable(128) align(128) %base) {
>> %0:
>> %ptr = gep inbounds * dereferenceable(128) align(128) %base, 1 x i64 1
>> %p = bitcast * %ptr to *
>> %gep = gep inbounds * %p, 16 x i64 0, 2 x i64 1
>> %1 = bitcast * %gep to *
>> %r = load <8 x i16>, * %1, align 1
>> ret <8 x i16> %r
>> }
>> Transformation doesn't verify!
>> ERROR: Target is more poisonous than source
>>
>> Example:
>> * dereferenceable(128) align(128) %base = pointer(non-local, block_id=1, offset=1664)
>>
>> Source:
>> * %ptr = pointer(non-local, block_id=1, offset=1665)
>> * %p = pointer(non-local, block_id=1, offset=1665)
>> * %gep = pointer(non-local, block_id=1, offset=1667)
>> i16 %s = poison
>> <8 x i16> %r = < poison, any, any, any, any, any, any, any >
>>
>> SOURCE MEMORY STATE
>> ===================
>> NON-LOCAL BLOCKS:
>> Block 0 > size: 0 align: 1 alloc type: 0
>> Block 1 > size: 2048 align: 128 alloc type: 0
>>
>> Target:
>> * %ptr = pointer(non-local, block_id=1, offset=1665)
>> * %p = pointer(non-local, block_id=1, offset=1665)
>> * %gep = pointer(non-local, block_id=1, offset=1667)
>> * %1 = pointer(non-local, block_id=1, offset=1667)
>> <8 x i16> %r = < poison, poison, poison, poison, poison, poison, poison, poison >
>> Source value: < poison, any, any, any, any, any, any, any >
>> Target value: < poison, poison, poison, poison, poison, poison, poison, poison >
>>
>> Alive2: Transform doesn't verify!
>
> IIUC, this is a question of allowing poison (from the unused loaded memory elements) to propagate?
> So we have to freeze or explicitly make those elements undef again?
> https://alive2.llvm.org/ce/z/LKqBVW
That is how I read it, yes. The extra legality shuffle will be gone in codegen, so there is no need to cost it.
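
For illustration, the "explicitly make those elements undef again" option could look roughly like the sketch below. This is not the actual output of the patch: the names %vp and %wide are made up here, and it assumes the shufflevector semantics in effect at the time of this thread, where an undef mask element yields an undef (not poison) result lane.

```llvm
define <8 x i16> @t(i8* align 128 dereferenceable(128) %base) {
  %ptr = getelementptr inbounds i8, i8* %base, i64 1
  %p = bitcast i8* %ptr to <8 x i16>*
  %gep = getelementptr inbounds <8 x i16>, <8 x i16>* %p, i64 0, i64 1
  ; Widen the scalar load to a full vector load (hypothetical names %vp, %wide).
  %vp = bitcast i16* %gep to <8 x i16>*
  %wide = load <8 x i16>, <8 x i16>* %vp, align 1
  ; Keep lane 0 and reset lanes 1..7 to undef, so poison read from the
  ; otherwise unused memory does not reach the result.
  %r = shufflevector <8 x i16> %wide, <8 x i16> undef, <8 x i32> <i32 0, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
  ret <8 x i16> %r
}
```

With this shape, lane 0 still carries whatever the original scalar load produced (including poison), while lanes 1 to 7 match the `any` lanes reported for the source above.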
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93229/new/
https://reviews.llvm.org/D93229