[llvm-dev] What should a truncating store do?
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 25 11:08:01 PDT 2017
On 9/25/2017 9:14 AM, Björn Pettersson A wrote:
>
> (Not sure if this exactly maps to “truncating store”, but I think it
> at least touches some of the subjects discussed in this thread)
>
> Our out-of-tree target needs several patches to get things working
> correctly for us.
>
> We have introduced i24 and i40 types in ValueTypes/MachineValueTypes
> (in addition to the normal pow-of-2 types). And we have vectors of
> those (v2i40, v4i40).
>
> And the byte size in our target is 16 bits.
>
> When storing an i40 we need to store it as three 16-bit bytes, i.e. 48
> bits.
>
> When storing a v4i40 vector it will be stored as 4x48 bits.
>
> One thing that we have had to patch is the getStoreSize() method in
> ValueTypes/MachineValueTypes where we assume that vectors are
> bitpacked when the element size is smaller than the byte size
> (“BitsPerByte”):
>
> /// Return the number of bytes overwritten by a store of the specified value
> /// type.
> unsigned getStoreSize() const {
> -  return (getSizeInBits() + 7) / 8;
> +  // We assume that vectors with elements smaller than the byte size are
> +  // bitpacked, and that elements larger than the byte size should be padded
> +  // (e.g. the i40 type for Phoenix is stored using 3 bytes (48 bits)).
> +  bool PadElementsToByteSize =
> +      isVector() && getScalarSizeInBits() >= BitsPerByte;
> +  if (PadElementsToByteSize)
> +    return getVectorNumElements() * getScalarType().getStoreSize();
> +  return (getSizeInBits() + (BitsPerByte - 1)) / BitsPerByte;
> }
>
> The patch seems to work for the in-tree target tests as well as our
> out-of-tree target.
>
> Whether this assumption is correct for all targets is beyond my
> knowledge. Maybe only i1 vectors should be bitpacked?
>
> Anyway, I think the bitpacked case is very special (we do not use it
> for our target…).
>
> AFAIK bitcast is defined as a write to memory followed by a load using
> a different type. And I think that doing several scalar operations
> should give the same result as using vectors. So bitcasts of bitpacked
> vectors should probably be avoided?
>
Yes, store+load is the right definition of bitcast. And in fact, the
backend will lower a bitcast to a store+load to a stack temporary in
cases where there isn't some other lowering specified.
The end result is probably going to be pretty inefficient unless your
target has a special instruction to handle it (x86 has pmovmskb for i1
vector bitcasts, but otherwise you probably end up with some terrible
lowering involving a lot of shifts).
> This also reminded me of the following test case that is in trunk:
> test/CodeGen/X86/pr20011.ll
>
> %destTy = type { i2, i2 }
>
> define void @crash(i64 %x0, i64 %y0, %destTy* nocapture %dest) nounwind {
> ; X64-LABEL: crash:
> ; X64: # BB#0:
> ; X64-NEXT: andl $3, %esi
> ; X64-NEXT: movb %sil, (%rdx)
> ; X64-NEXT: andl $3, %edi
> ; X64-NEXT: movb %dil, (%rdx)
> ; X64-NEXT: retq
>   %x1 = trunc i64 %x0 to i2
>   %y1 = trunc i64 %y0 to i2
>   %1 = bitcast %destTy* %dest to <2 x i2>*
>   %2 = insertelement <2 x i2> undef, i2 %x1, i32 0
>   %3 = insertelement <2 x i2> %2, i2 %y1, i32 1
>   store <2 x i2> %3, <2 x i2>* %1, align 1
>   ret void
> }
>
> As you can see from the “X64” checks, the behavior is quite weird.
>
> Both movb instructions write to the same address, so the result of the
> store <2 x i2> is the same as when storing only one of the elements.
>
> Is this really expected?
>
> We have emailed Simon Pilgrim, who added the test case to show that we
> no longer crash on it (see
> https://bugs.llvm.org/show_bug.cgi?id=20011). But even if the compiler
> doesn’t crash, the behavior seems wrong to me.
>
Yes, the behavior here is wrong.
DAGTypeLegalizer::SplitVecOp_STORE/DAGTypeLegalizer::SplitVecRes_LOAD/etc.
assume the element size is a multiple of 8. I'm sure this has been
discussed before, but I guess nobody ever wrote a patch to fix it...?
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project