[LLVMdev] Limit loop vectorizer to SSE
Frank Winter
fwinter at jlab.org
Fri Nov 15 17:23:26 PST 2013
I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs.
My program still passes 16-byte-aligned pointers to the function,
which the loop vectorizer processes successfully:
LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1
Since the program runs fine, it seems the CPU is allowed to issue
a vector load (8 floats) from a 16-byte-aligned address (as opposed
to a 32-byte-aligned one). Or does the loop vectorizer in fact
handle this case in the preamble, so that vector.body issues only
32-byte-aligned accesses? In that case I would align the payload
to 32 bytes in order to save a little in the preamble.
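
For reference, a minimal sketch of what aligning the payload to 32 bytes
could look like on the caller's side. It assumes a C11 toolchain that
provides aligned_alloc; the helper names is_aligned and alloc_floats_32 are
hypothetical and not taken from my program or from LLVM. Whether the
stricter alignment actually saves anything depends on the answer to the
question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr is a multiple of `alignment`,
 * which must be a nonzero power of two, and 0 otherwise. */
static int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Keep only the low bits of the address; they are all zero
     * exactly when the address is aligned. */
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical helper: allocates room for n floats on a 32-byte
 * boundary. C11 aligned_alloc expects the size to be a multiple of
 * the alignment, so the byte count is rounded up to the next
 * multiple of 32. Release the result with free(). */
static float *alloc_floats_32(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + 31) & ~(size_t)31;
    if (rounded == 0)
        rounded = 32; /* avoid a zero-size request */
    return aligned_alloc(32, rounded);
}
```

A caller would obtain the buffer with alloc_floats_32(n), could verify it
with is_aligned(p, 32) before handing it to the JIT-compiled function, and
would release it with free(p).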
Frank
On 15/11/13 18:15, Arnold Schwaighofer wrote:
> A fix for this is in r194876.
>
> Thanks for reporting this!
>
>
> On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:
>
>> Nadav,
>>
>> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere.
>>
>> v/r,
>> Josh
>>
>> [1] http://pastebin.com/kc95WtGG
>> [2] http://pastebin.com/VY3ZLVJK
>>
>>
>> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote:
>>
>> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote:
>>
>>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote:
>>> Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks!
>>>
>>> I think that's a fair question, and it's about safety. If you're getting this on the JIT, it means we may be generating unsafe transformations in the vectorizer.
>>>
>>> Arnold, Nadav, I don't remember seeing code to generate any run-time alignment checks on the incoming pointer, is there such a thing? If not, shouldn't we add one?
>>
>> If the vectorizer generates aligned memory accesses to unaligned addresses then this is a serious bug. But I don’t think that Josh said that the vectorizer generated aligned accesses to unaligned pointers.
>>
>> There is no point in LLVM checking for alignment because if the memory is unaligned then the program will crash. Users who want to crash with a readable error message can simply write code that checks the pointer (by masking the high bits and comparing to zero).
>>
>>