[llvm-dev] Vectorizer has trouble with vpmovmskb and store

Johan Engelen via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 26 14:50:52 PST 2018


Hi all,
  I've run into a case where the optimizer seems to be having trouble doing
the "obvious" thing.

Consider this code:
```
define i16 @foo(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) {
    %a1 = icmp slt <16 x i8> %a0, zeroinitializer
    %a2 = bitcast <16 x i1> %a1 to i16
    %astore = getelementptr inbounds <8 x i16>, <8 x i16>* %egress, i64 0, i64 7
    ;store i16 %a2, i16* %astore
    ret i16 %a2
}
```
The optimizer recognizes this and llc nicely outputs a vpmovmskb
instruction:
```
foo: # @foo
    vpmovmskb eax, xmm0
    ret
```

Writing to the output vector also works well:
```
define void @writing(<8 x i16>* dereferenceable(16) %egress, <16 x i8> %a0) {
    %astore = getelementptr inbounds <8 x i16>, <8 x i16>* %egress, i64 0, i64 7
    store i16 123, i16* %astore
    ret void
}
```
outputs:
```
writing: # @writing
    mov word ptr [rdi + 14], 123
    ret
```

Now, combining the two by uncommenting the store in `foo()` suddenly
produces a very large function, instead of just:
    vpmovmskb eax, xmm0
    mov word ptr [rdi + 14], ax
    ret

Is there something wrong with my IR code, or is the optimizer somehow
confused? Can I rewrite the code such that the optimizer does understand?
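
To make the intended behaviour explicit, here is a minimal scalar sketch in C of what I expect `foo()` to compute once the store is enabled. It is only a reference model, not the IR itself: the names `movemask_model` and `foo_model` are made up for illustration, and it assumes the little-endian lane order used on x86, where bit i of the `i16` comes from byte i of the vector.

```c
#include <stdint.h>

/* Scalar model of `icmp slt <16 x i8> %a0, zeroinitializer` followed by
   `bitcast <16 x i1> ... to i16`: bit i of the result is the sign bit of
   byte i (assumes little-endian lane order, as on x86). */
static uint16_t movemask_model(const uint8_t a0[16]) {
    uint16_t mask = 0;
    for (int i = 0; i < 16; ++i) {
        /* a0[i] >> 7 is 1 when the byte is negative as a signed i8. */
        mask |= (uint16_t)((a0[i] >> 7) << i);
    }
    return mask;
}

/* Hypothetical model of foo() with the store uncommented: write the mask
   to element 7 of the 8 x i16 output (byte offset 7 * 2 = 14) and also
   return it. */
static uint16_t foo_model(uint16_t egress[8], const uint8_t a0[16]) {
    uint16_t mask = movemask_model(a0);
    egress[7] = mask;
    return mask;
}
```

The loop is what `vpmovmskb` does in one instruction, and the assignment to `egress[7]` corresponds to the `mov word ptr [rdi + 14], ax` above.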

Godbolt link: https://llvm.godbolt.org/z/OgExDk

Thanks a lot for the help.
Cheers,
  Johan