[llvm-dev] Question about store with unaligned memory address

jingu kang via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 2 06:18:35 PST 2016


Hi Bruce,

Thanks for the response.

I think you mean custom lowering of vector-type stores using the
algorithm you mentioned. It can avoid the situation I described in my
previous e-mail. But I am not 100% sure that vector stores are the only
source of unchained stores; I have seen 'getMemcpyLoadsAndStores'
generate them as well. That could also be avoided with something like
'MaxStoresPerMemcpy = 0'. I wanted to solve the problem at the level of
each individual store, for example by lowering the store or by making a
pseudo machine instruction.

Thanks,
JinGu Kang

2016-02-02 12:39 GMT+00:00 Bruce Hoult <bruce at hoult.org>:
> Sorry .. *two* elt loads, one elt store, and an aligned vector store.
>
>
> On Tue, Feb 2, 2016 at 3:30 PM, Bruce Hoult <bruce at hoult.org> wrote:
>>
>> I'm afraid I'm not yet skilled with LLVM IR, but I think you'd probably
>> want to do something like this:
>>
>> #include <stdint.h>
>>
>> typedef uint32_t elt;
>> typedef elt v4 __attribute__ ((vector_size (16)));
>>
>> void store(v4 *p, v4 v){
>>   // Offset of p within an elt-sized word; 0 means the store is aligned.
>>   uintptr_t align = (uintptr_t)p & (sizeof(elt)-1);
>>   if (align == 0){
>>     *p = v;
>>   } else {
>>     // assumes sizeof(elt) == 4, align == 2, and a little-endian target
>>     elt *base = (elt*)((uintptr_t)p - 2);
>>     elt lomask = (1u<<16)-1;
>>     elt himask = ~lomask;
>>     base[0] = (base[0] & lomask) | v[0] << 16;
>>     base[1] = v[0] >> 16 | v[1] << 16;
>>     base[2] = v[1] >> 16 | v[2] << 16;
>>     base[3] = v[2] >> 16 | v[3] << 16;
>>     base[4] = v[3] >> 16 | (base[4] & himask);
>>   }
>> }
>>
>> Of course you'll want to convert one LLVM IR to another, not actually
>> insert C code, but you can compile this code and see what IR is produced.
>>
>> Interestingly, LLVM manages to optimize this to one elt load, one elt
>> store, and an aligned vector store. Plus some in-registers shuffling, of
>> course.
>>
>>
>> On Mon, Feb 1, 2016 at 5:49 PM, jingu kang <jaykang10 at gmail.com> wrote:
>>>
>>> Hi Bruce,
>>>
>>> Thanks for the response.
>>>
>>> I also think it is not a good way. Do you know of other ways to
>>> legalize it?
>>>
>>> Thanks,
>>> JinGu Kang
>>>
>>> 2016-02-01 13:11 GMT+00:00 Bruce Hoult <bruce at hoult.org>:
>>> > In fact this is a pretty bad legalizing/lowering because you only need
>>> > to load and edit for the first and last values in the vector. The other
>>> > words are completely replaced and don't need to be loaded at all.
>>> >
>>> > I think you need to legalize differently when it is not aligned.
>>> >
>>> > On Fri, Jan 29, 2016 at 10:07 PM, jingu kang via llvm-dev
>>> > <llvm-dev at lists.llvm.org> wrote:
>>> >>
>>> >> Hi Krzysztof,
>>> >>
>>> >> Thanks for the response.
>>> >>
>>> >> The method works for almost all test cases that use load and store
>>> >> instructions connected by a chain. There is another situation. Let's
>>> >> look at an example as follows:
>>> >>
>>> >> typedef unsigned short int UV __attribute__((vector_size (8)));
>>> >>
>>> >> void test (UV *x, UV *y) {
>>> >>   *x = *y / ((UV) { 4, 4, 4, 4 });
>>> >> }
>>> >>
>>> >> The target does not support vector types, so CodeGen splits and
>>> >> scalarizes the vector to legalize the type. During vector type
>>> >> legalization, the store nodes for the individual vector elements are
>>> >> generated by 'DAGTypeLegalizer::SplitVecOp_STORE', but these stores
>>> >> are not connected to each other by a chain. I guess it assumes that
>>> >> each vector element's address is different. Each store is then
>>> >> lowered to load and store nodes on the high and low addresses, but
>>> >> those nodes are not connected with the ones from the other stores.
>>> >> This causes a problem. I am not sure how to solve this situation
>>> >> correctly.
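
The hazard described above can be illustrated outside of LLVM with a small
sketch. It assumes, purely for illustration, that two adjacent 16-bit vector
elements share one 32-bit word and that each element store has been lowered
to a load/modify/store of that word. The function names are hypothetical and
this is not the code that SelectionDAG produces; it only models two possible
orderings of the resulting loads and stores.

```cpp
#include <cassert>
#include <cstdint>

// Ordered ("chained") schedule: the load for element 1 happens after the
// store for element 0, so it observes the updated word.
std::uint32_t store_pair_chained(std::uint32_t word, std::uint16_t e0,
                                 std::uint16_t e1) {
  std::uint32_t t = word;                                 // load for element 0
  word = (t & 0xFFFF0000u) | e0;                          // store for element 0
  t = word;                                               // load for element 1
  word = (t & 0x0000FFFFu) | (std::uint32_t(e1) << 16);   // store for element 1
  return word;
}

// Unordered schedule: nothing ties the two read-modify-write sequences
// together, so both loads may be placed before both stores.
std::uint32_t store_pair_unchained(std::uint32_t word, std::uint16_t e0,
                                   std::uint16_t e1) {
  std::uint32_t t0 = word;                                // load for element 0
  std::uint32_t t1 = word;                                // load for element 1 (stale)
  word = (t0 & 0xFFFF0000u) | e0;                         // store for element 0
  word = (t1 & 0x0000FFFFu) | (std::uint32_t(e1) << 16);  // overwrites e0 again
  return word;
}

// Small demonstration of the difference between the two schedules.
void demo_schedules() {
  assert(store_pair_chained(0xAAAABBBBu, 0x1111, 0x2222) == 0x22221111u);
  assert(store_pair_unchained(0xAAAABBBBu, 0x1111, 0x2222) == 0x2222BBBBu);
}
```

In the unordered schedule the first element is lost because the second
read-modify-write started from a stale copy of the word, which is why the
lowered loads and stores of neighbouring elements need an ordering
dependency between them.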
>>> >>
>>> >> Thanks,
>>> >> JinGu Kang
>>> >>
>>> >>
>>> >> 2016-01-29 18:11 GMT+00:00 Krzysztof Parzyszek via llvm-dev
>>> >> <llvm-dev at lists.llvm.org>:
>>> >> > On 1/29/2016 10:47 AM, JinGu Kang via llvm-dev wrote:
>>> >> >>
>>> >> >>
>>> >> >> I am doing it by lowering the store as follows:
>>> >> >>
>>> >> >> 1. make the low and high addresses according to the alignment.
>>> >> >> 2. load 2 words from the low and high addresses.
>>> >> >> 3. combine them with the value to store according to the alignment.
>>> >> >> 4. store the 2 modified words to the low and high addresses.
>>> >> >
>>> >> >
>>> >> > Sounds ok.
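
A minimal sketch of the four steps quoted above, written as ordinary C++
rather than as a DAG lowering. It assumes a little-endian memory that can
only be accessed as aligned 32-bit words and a 16-bit value to store; the
function name and the word-array model of memory are hypothetical and only
serve to make the steps concrete.

```cpp
#include <cassert>
#include <cstdint>

// Store a 16-bit value at byte address `addr` in a memory modelled as an
// array of aligned 32-bit words (little-endian byte order assumed).
void store16_via_words(std::uint32_t *mem, std::uint32_t addr,
                       std::uint16_t value) {
  assert(mem != nullptr);
  // 1. Low and high word indices covering bytes addr and addr + 1.
  std::uint32_t lo = addr / 4;
  std::uint32_t hi = (addr + 1) / 4;
  // 2. Load both words and view them as one 64-bit quantity.
  std::uint64_t pair =
      std::uint64_t(mem[lo]) | (std::uint64_t(mem[hi]) << 32);
  // 3. Replace the 16 bits that start at the byte offset in the low word.
  unsigned shift = (addr % 4) * 8;
  std::uint64_t mask = std::uint64_t(0xFFFF) << shift;
  pair = (pair & ~mask) | (std::uint64_t(value) << shift);
  // 4. Store the modified words back. When the value does not cross a word
  //    boundary, lo == hi and only the low word is written.
  mem[lo] = std::uint32_t(pair);
  if (hi != lo) mem[hi] = std::uint32_t(pair >> 32);
}
```

For example, with mem = {0x11111111, 0x22222222}, storing 0xABCD at byte
address 3 puts the low byte into the top byte of the first word and the high
byte into the bottom byte of the second word. Both loads must complete before
either store, and any other access to the same two words has to be ordered
relative to this sequence.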
>>> >> >
>>> >> >
>>> >> >> In order to keep the order between loads and stores, I have used
>>> >> >> chain and glue on the DAG, but some passes have mixed them up at
>>> >> >> the machine instruction level.
>>> >> >
>>> >> >
>>> >> > Glue isn't necessary, chains are sufficient.
>>> >> >
>>> >> > I'm not sure what pass reordered dependent loads and stores, but
>>> >> > that sounds bad.  What matters in cases like this are the
>>> >> > MachineMemOperands. If there isn't any on a load/store instruction,
>>> >> > it should be treated conservatively (i.e. alias everything else),
>>> >> > if there is one, it'd better be correct. Wrong MMO could certainly
>>> >> > lead to such behavior.
>>> >> >
>>> >> > -Krzysztof
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> >> > hosted by The Linux Foundation
>>> >> >
>>> >> > _______________________________________________
>>> >> > LLVM Developers mailing list
>>> >> > llvm-dev at lists.llvm.org
>>> >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> >
>>> >
>>>
>>>
>>
>

