[llvm-dev] Vectorizer has trouble with vpmovmskb and store
Johan Engelen via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 26 14:50:52 PST 2018
Hi all,
I've run into a case where the optimizer seems to be having trouble doing
the "obvious" thing.
Consider this code:
```
define i16 @foo(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) {
  %a1 = icmp slt <16 x i8> %a0, zeroinitializer
  %a2 = bitcast <16 x i1> %a1 to i16
  %astore = getelementptr inbounds <8 x i16>, <8 x i16>* %egress, i64 0, i64 7
  ;store i16 %a2, i16* %astore
  ret i16 %a2
}
```
The optimizer recognizes the compare-and-bitcast pattern, and llc nicely
outputs a single vpmovmskb instruction:
```
foo: # @foo
        vpmovmskb eax, xmm0
        ret
```
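For reference, the value that `foo` returns can be modelled in scalar C. This is only a sketch under stated assumptions: `sign_mask16` is a hypothetical helper name that does not appear in the IR above, and the bit order assumes a little-endian target such as x86, where lane 0 of the `<16 x i1>` vector lands in the least significant bit of the `i16`.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the original IR): scalar model of
 *   %a1 = icmp slt <16 x i8> %a0, zeroinitializer
 *   %a2 = bitcast <16 x i1> %a1 to i16
 * assuming a little-endian target, so lane i maps to bit i. */
uint16_t sign_mask16(const int8_t lanes[16]) {
    uint16_t mask = 0;
    for (int i = 0; i < 16; ++i) {
        if (lanes[i] < 0) {               /* sign bit of byte i is set */
            mask |= (uint16_t)(1u << i);  /* record it as bit i */
        }
    }
    return mask;
}
```

This is the same per-byte sign-bit gather that a single vpmovmskb performs on an xmm register.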
Writing to the output vector also works well:
```
define void @writing(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) {
  %astore = getelementptr inbounds <8 x i16>, <8 x i16>* %egress, i64 0, i64 7
  store i16 123, i16* %astore
  ret void
}
```
outputs:
```
writing: # @writing
        mov word ptr [rdi + 14], 123
        ret
```
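The `[rdi + 14]` address follows from the GEP indices: element 7 of an `<8 x i16>` vector sits at byte offset 7 * 2 = 14. A small C sketch of that arithmetic follows. Both `writing_model` and `load_u16_at` are hypothetical names introduced here for illustration, and the model treats the vector as a plain array of eight 16-bit values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of @writing: store 123 into 16-bit lane 7. */
void writing_model(uint16_t egress[8]) {
    egress[7] = 123;
}

/* Hypothetical helper: read a 16-bit value at a raw byte offset,
 * using memcpy to avoid alignment and aliasing problems. */
uint16_t load_u16_at(const void *base, size_t byte_offset) {
    uint16_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}
```

After calling `writing_model(buf)`, reading at byte offset 14 with `load_u16_at(buf, 14)` yields the stored value, matching the address in the assembly.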
However, combining the two by uncommenting the store in `foo()` suddenly
produces a very large function, instead of just:
        vpmovmskb eax, xmm0
        mov word ptr [rdi + 14], ax
        ret
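To make the intended semantics concrete, here is a scalar C sketch of `foo` with the store enabled. It is an illustration only, not a proposed fix: `foo_model` is a hypothetical name, the vectors are modelled as plain arrays, and lane 0 is assumed to map to the least significant bit, as on little-endian x86.

```c
#include <stdint.h>

/* Hypothetical model of @foo with the store uncommented:
 * gather the sign bit of each byte into a 16-bit mask, write the
 * mask to 16-bit lane 7 of egress (byte offset 14), and return it. */
uint16_t foo_model(uint16_t egress[8], const int8_t a0[16]) {
    uint16_t mask = 0;
    for (int i = 0; i < 16; ++i) {
        if (a0[i] < 0) {
            mask |= (uint16_t)(1u << i);  /* lane i -> bit i */
        }
    }
    egress[7] = mask;  /* the store that was commented out in the IR */
    return mask;
}
```

Only lane 7 of the output is written. The other seven lanes keep their previous contents, which is what the two-instruction sequence above would also do.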
Is there something wrong with my IR code, or is the optimizer somehow
confused? Can I rewrite the code such that the optimizer does understand?
Godbolt link: https://llvm.godbolt.org/z/OgExDk
Thanks a lot for the help.
Cheers,
Johan