[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths

Nicolas Vasilache via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 9 00:57:43 PST 2021


Hi everyone,

I am experimenting with LLVM lowering, intrinsics and shufflevector in
general.

Here is the IR that I produce, with the objective of emitting some vblendps
instructions:
https://gist.github.com/nicolasvasilache/0fe30c83cbfe5b4776ec9f0ee465611a

I compile this further with

clang -x ir -emit-llvm -S -mcpu=haswell -O3 -o - | llc -O3 -mcpu=haswell - -o -

to obtain:

https://gist.github.com/nicolasvasilache/2c773b86fcda01cc28711828a0a9ce0a

At this point, I would expect vblendps instructions to be generated for the
pieces of IR that produce %48/%49, %51/%52, %54/%55 and %57/%58, which would
reduce pressure on port 5 (vblendps can also go on ports 0 and 1). However,
the expected instructions are not generated, and llvm-mca continues to show
high port 5 contention.

Could people suggest some steps / commands to help better understand why my
expectation is not met and whether I can do something to make the compiler
generate what I want? Thanks in advance!

I have verified independently that, in isolation, a single such shuffle
produces a vblendps. In the full assembly, however, I see the shuffles being
recombined, and I am looking for ways to prevent vshufps + vblendps +
vblendps from being recombined into vunpckxxx + vunpckxxx instructions.
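
To make the blend vs. unpack distinction concrete, here is a minimal Python
sketch (not LLVM code) that models shufflevector mask semantics on two
8 x float operands. The helper names (shuffle, is_blend_mask,
blend_immediate) are hypothetical and only for illustration; undef mask
lanes are ignored for simplicity. A mask is a plain blend when lane i only
ever selects element i of the first or the second operand, whereas the
vunpcklps/vunpckhps masks move elements across positions.

```python
from typing import List, Optional, Sequence


def shuffle(a: Sequence, b: Sequence, mask: Sequence[int]) -> List:
    """Model shufflevector: indices 0..n-1 pick from a, n..2n-1 pick from b."""
    both = list(a) + list(b)
    return [both[m] for m in mask]


def is_blend_mask(mask: Sequence[int]) -> bool:
    """True if every lane i keeps its position, i.e. mask[i] is i or i + n."""
    n = len(mask)
    return all(m in (i, i + n) for i, m in enumerate(mask))


def blend_immediate(mask: Sequence[int]) -> Optional[int]:
    """Immediate for a blend: bit i is set when lane i comes from operand 2."""
    if not is_blend_mask(mask):
        return None
    n = len(mask)
    return sum(1 << i for i, m in enumerate(mask) if m >= n)


# 256-bit vunpcklps / vunpckhps interleave within each 128-bit lane.
UNPCKLPS_YMM = [0, 8, 1, 9, 4, 12, 5, 13]
UNPCKHPS_YMM = [2, 10, 3, 11, 6, 14, 7, 15]

# An illustrative blend mask: odd lanes come from the second operand.
BLEND_ODD = [0, 9, 2, 11, 4, 13, 6, 15]

if __name__ == "__main__":
    a = [f"a{i}" for i in range(8)]
    b = [f"b{i}" for i in range(8)]
    for name, mask in [("blend", BLEND_ODD),
                       ("unpcklps", UNPCKLPS_YMM),
                       ("unpckhps", UNPCKHPS_YMM)]:
        print(name, shuffle(a, b, mask), "blend imm:", blend_immediate(mask))
```

This only classifies masks; it says nothing about which instruction the X86
backend actually selects once neighbouring shuffles have been combined.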

-- 
N