[llvm-dev] RFC: code size reduction in X86 by replacing EVEX with VEX encoding

Wed Nov 23 05:01:18 PST 2016

----- Original Message -----

> From: "Gadi Haber via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Wednesday, November 23, 2016 5:50:42 AM
> Subject: [llvm-dev] RFC: code size reduction in X86 by replacing EVEX
> with VEX encoding

> Hi All.

> This is an RFC for a proposed target specific X86 optimization for
> reducing code size in the encoding of AVX-512 instructions when
> possible.

> When the AVX512F instruction set was introduced in X86, it added 32
> registers of 512 bits each, ZMM0-ZMM31, as well as 16 additional
> XMM registers, XMM16-XMM31, and 16 additional YMM registers,
> YMM16-YMM31.
> In order to encode the new registers 16-31 and the additional
> instructions, a new encoding prefix called EVEX, which extends the
> existing VEX encoding, was introduced as shown below:

> The EVEX encoding format:
>             EVEX  Opcode  ModR/M  [SIB]  [Disp32] / [Disp8*N]  [Immediate]
> # of bytes: 4     1       1       1      4        / 1          1

> The existing VEX encoding format:
>             [VEX]  OPCODE  ModR/M  [SIB]  [DISP]   [IMM]
> # of bytes: 0,2,3  1       1       0,1    0,1,2,4  0,1

> Note that the EVEX prefix requires 4 bytes, whereas the VEX prefix
> takes at most 3 bytes.
> Consequently, for the SKX architecture, many instructions that use
> only the lower registers XMM0-XMM15 or YMM0-YMM15 can be encoded
> in either the EVEX or the VEX format. In such cases, using the VEX
> encoding reduces code size by ~2 bytes per instruction, even when
> the code is compiled with the AVX512F/AVX512VL features enabled.

> For example, "vmovss %xmm0, 32(%rsp,%rax,4)" has the following 2
> possible encodings:

> EVEX encoding (8 bytes long):
> 62 f1 7e 08 11 44 84 08 vmovss %xmm0, 32(%rsp,%rax,4)

> VEX encoding (6 bytes long):
> c5 fa 11 44 84 20 vmovss %xmm0, 32(%rsp,%rax,4)
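
The two byte sequences quoted above can be checked with a few lines of C++. This is only a sketch for illustration: the helper names (prefixLength, compressDisp8, checkQuotedExample) are made up here and are not LLVM APIs, and 64-bit mode is assumed so that 0x62, 0xC4 and 0xC5 are unambiguously prefix bytes. It also shows why the last byte differs: EVEX stores the 8-bit displacement divided by N (N = 4 for vmovss), so 32 is stored as 0x08, while VEX stores it unscaled as 0x20.

```cpp
#include <cassert>
#include <cstdint>

// Length of the prefix that starts with firstByte (64-bit mode assumed):
// 0x62 opens a 4-byte EVEX prefix, 0xC4 a 3-byte VEX prefix,
// 0xC5 a 2-byte VEX prefix.
inline unsigned prefixLength(uint8_t firstByte) {
  switch (firstByte) {
  case 0x62: return 4;
  case 0xC4: return 3;
  case 0xC5: return 2;
  default:   return 0; // neither VEX nor EVEX
  }
}

// EVEX stores an 8-bit displacement scaled down by N (the Disp8*N field).
// Returns false if disp is not a multiple of n or the quotient does not
// fit in a signed byte; the encoder then has to fall back to Disp32.
inline bool compressDisp8(int32_t disp, int32_t n, int8_t &stored) {
  if (n <= 0 || disp % n != 0)
    return false;
  int32_t q = disp / n;
  if (q < -128 || q > 127)
    return false;
  stored = static_cast<int8_t>(q);
  return true;
}

// The two encodings of vmovss %xmm0, 32(%rsp,%rax,4) from the message.
static const uint8_t EvexBytes[] = {0x62, 0xf1, 0x7e, 0x08,
                                    0x11, 0x44, 0x84, 0x08};
static const uint8_t VexBytes[] = {0xc5, 0xfa, 0x11, 0x44, 0x84, 0x20};

// Checks the quoted example: the whole 2-byte saving comes from the prefix,
// and displacement 32 is stored as 0x08 (EVEX, N = 4) vs 0x20 (VEX).
inline void checkQuotedExample() {
  assert(sizeof(EvexBytes) - sizeof(VexBytes) == 2);
  assert(prefixLength(EvexBytes[0]) - prefixLength(VexBytes[0]) == 2);
  int8_t stored = 0;
  bool ok = compressDisp8(32, 4, stored);
  assert(ok && stored == EvexBytes[7]);
  assert(VexBytes[5] == 32);
  (void)ok;
}
```

Calling checkQuotedExample() from any test driver exercises both helpers against the bytes in the message.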

> See reported Bugzilla bugs about this proposed optimization:
> https://llvm.org/bugs/show_bug.cgi?id=23376
> https://llvm.org/bugs/show_bug.cgi?id=29162

> The proposed implementation is to add a table of all EVEX opcodes
> that can be encoded via VEX in a new header file placed under
> lib/Target/X86.
> A new pass is to be added at the pre-emit stage.
It might be better to have TableGen generate the mapping table for you instead of manually making a table yourself. TableGen has a feature that is specifically designed to make mapping tables like this. For examples, grep for InstrMapping in: 

lib/Target/Hexagon/Hexagon.td 
lib/Target/Mips/MipsDSPInstrFormats.td 
lib/Target/Mips/MipsInstrFormats.td 
lib/Target/Mips/Mips32r6InstrFormats.td 
lib/Target/PowerPC/PPC.td 
lib/Target/AMDGPU/SIInstrInfo.td 
lib/Target/AMDGPU/R600Instructions.td 
lib/Target/SystemZ/SystemZInstrFormats.td 
lib/Target/Lanai/LanaiInstrInfo.td 

I've used this feature a few times in the PowerPC backend, and it's quite convenient. 
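
Whichever way the table is produced (by hand or by TableGen), the pass would consume it in roughly the same way. The sketch below is only an illustration under stated assumptions: the opcode names and numbers, the Instr struct, and the lookupVex and tryCompress helpers are hypothetical stand-ins, not the generated X86 opcodes or any existing LLVM API. It assumes the table lists only EVEX opcodes that have an equivalent VEX form and that it is sorted by EVEX opcode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical opcode numbers, standing in for generated X86 opcodes.
enum Opcode : uint16_t {
  NoVexForm = 0,
  VADDPSrr_VEX = 1,
  VMOVSSmr_VEX = 2,
  VADDPSrr_EVEX = 10,
  VMOVSSmr_EVEX = 11,
  VPTERNLOGDrri_EVEX = 12 // AVX-512 only, so it has no table entry
};

struct EvexToVexEntry {
  uint16_t Evex;
  uint16_t Vex;
};

// Must stay sorted by Evex so that binary search works.
static const EvexToVexEntry EvexToVexTable[] = {
    {VADDPSrr_EVEX, VADDPSrr_VEX},
    {VMOVSSmr_EVEX, VMOVSSmr_VEX},
};

// Simplified instruction: an opcode plus the highest XMM/YMM register
// number it references (0 if it references none).
struct Instr {
  uint16_t Op;
  unsigned MaxVecReg;
};

// Returns the VEX opcode for evexOp, or NoVexForm if the table has none.
inline uint16_t lookupVex(uint16_t evexOp) {
  const EvexToVexEntry *first = std::begin(EvexToVexTable);
  const EvexToVexEntry *last = std::end(EvexToVexTable);
  assert(std::is_sorted(first, last,
                        [](const EvexToVexEntry &a, const EvexToVexEntry &b) {
                          return a.Evex < b.Evex;
                        }) &&
         "EvexToVexTable must be sorted by Evex");
  const EvexToVexEntry *it = std::lower_bound(
      first, last, evexOp,
      [](const EvexToVexEntry &e, uint16_t op) { return e.Evex < op; });
  return (it != last && it->Evex == evexOp) ? it->Vex : NoVexForm;
}

// Rewrites I to its VEX form when it only uses registers 0-15 and a VEX
// equivalent exists. Returns true if the opcode was changed.
inline bool tryCompress(Instr &I) {
  if (I.MaxVecReg > 15)
    return false; // XMM16-31 / YMM16-31 need EVEX
  uint16_t vex = lookupVex(I.Op);
  if (vex == NoVexForm)
    return false;
  I.Op = vex;
  return true;
}
```

A pre-emit pass would call something like tryCompress on every instruction of every basic block; the register check is what keeps instructions that touch registers 16-31 in their EVEX form.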

-Hal 

> No need for special Opt flags, as it is always better to use the
> reduced VEX encoding when possible.

> Thank you for any comments or questions that you may have.

> Sincerely,

> Gadi.

-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 