[LLVMdev] Adding masked vector load and store intrinsics
Smith, Kevin B
kevin.b.smith at intel.com
Fri Oct 24 12:40:52 PDT 2014
> How would one express such semantics in LLVM IR with this intrinsic? By definition, %data and %passthrough are different IR virtual registers and there are no copy instructions in LLVM IR.
You never need to express this semantic in LLVM IR, because in SSA form the result of an operation is always a different SSA def from the inputs to the operation. Some late stage of the code generator (CG) needs to handle
this, in exactly the same way that it already handles the mapping to regular X86 two-address code.
For example, this LLVM IR
%add = add nsw i32 %b, %a
gets converted into
# *** IR Dump After Expand ISel Pseudo-instructions ***:
# Machine code for function foo: SSA
Function Live Ins: %EDI in %vreg0, %ESI in %vreg1
BB#0: derived from LLVM BB %entry
Live Ins: %EDI %ESI
%vreg1<def> = COPY %ESI; GR32:%vreg1
%vreg0<def> = COPY %EDI; GR32:%vreg0
%vreg2<def,tied1> = ADD32rr %vreg1<tied0>, %vreg0, %EFLAGS<imp-def,dead>
in ISEL. So the necessary instruction semantics needn't be represented in LLVM IR. They are created once you map to "real" machine instructions using virtual registers, where copies, and the ability to mark a destination and a
source as "tied" together, are representable.
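As a rough illustration of the copy-plus-tied-operand idea, here is a minimal Python sketch. It is not LLVM's actual two-address pass. The `Instr` type and `to_two_address` function are hypothetical names made up for this example, and it assumes SSA input, so a destination never appears among its own sources.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Toy machine instruction: dest = op srcs... (hypothetical model)."""
    dest: str
    op: str
    srcs: Tuple[str, ...]


def to_two_address(instrs):
    """Rewrite `d = op a, b` as `d = COPY a; d = op d, b`.

    Assumes SSA input: each dest is defined once and is never one of
    its own sources, so the inserted copy cannot clobber an operand.
    """
    out = []
    for ins in instrs:
        if ins.op == "COPY":
            # Copies have no tied operand; pass them through unchanged.
            out.append(ins)
            continue
        first, rest = ins.srcs[0], ins.srcs[1:]
        # Place the tied source in the destination register first...
        out.append(Instr(ins.dest, "COPY", (first,)))
        # ...then operate in place, so dest and first source coincide.
        out.append(Instr(ins.dest, ins.op, (ins.dest,) + rest))
    return out


lowered = to_two_address([Instr("vreg2", "ADD32rr", ("vreg1", "vreg0"))])
for ins in lowered:
    print(ins.dest, "=", ins.op, ", ".join(ins.srcs))
```

A real register allocator would then try to coalesce the inserted copy away. For a masked load, the same trick would place %passthrough in the destination register before the masked instruction executes.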
From: dag at cray.com [mailto:dag at cray.com]
Sent: Friday, October 24, 2014 12:23 PM
To: Smith, Kevin B
Cc: Demikhovsky, Elena; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics
"Smith, Kevin B" <kevin.b.smith at intel.com> writes:
>> So %passthrough can *only* be undef or zeroinitializer?
> No, that wasn't the intent. %passthrough can be any other definition
> that is needed. Zero and undef were simply two possible values that
> illustrated some interesting behavior.
> Mapping of the %passthrough to the actual semantics of many vector
> instruction sets where the masked instructions leave the masked-off
> elements of the destination unchanged is done in a similar manner as
> three-address instructions are turned into two address instructions,
> by placing a copy as necessary so that dest and passthrough are in the
> same register.
How would one express such semantics in LLVM IR with this intrinsic? By
definition, %data and %passthrough are different IR virtual registers
and there are no copy instructions in LLVM IR.
In the more general case:
%b = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32> %a, i32 4, <8 x i1> %mask)
where %a and %b have no relation to each other, I presume the backend
would be responsible for doing a select/merge after the load if the ISA
didn't directly support the merge as part of the load operation. Right?
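To make the select/merge question concrete, here is a small per-lane model in Python. It is only a sketch of the semantics discussed in this thread, not of any backend. The function names, the dict-based memory, the element-indexed addressing and the `filler` value are all assumptions for illustration.

```python
def masked_load(memory, addr, passthrough, mask):
    """Lane i reads memory[addr + i] if mask[i] is set, else takes passthrough[i].

    Masked-off lanes never touch memory, so they cannot fault.
    """
    return [memory[addr + i] if m else p
            for i, (m, p) in enumerate(zip(mask, passthrough))]


def masked_load_then_select(memory, addr, passthrough, mask, filler=0):
    """Same result on an ISA whose masked load cannot merge by itself.

    Step 1: masked load leaving a don't-care value (filler) in masked-off lanes.
    Step 2: a separate select merges in the passthrough lanes.
    """
    loaded = [memory[addr + i] if m else filler for i, m in enumerate(mask)]
    return [l if m else p for l, m, p in zip(loaded, mask, passthrough)]


# Address 2 is deliberately absent: reading it would raise KeyError,
# standing in for a fault on a masked-off lane.
memory = {0: 10, 1: 11, 3: 13}
mask = [True, True, False, True]
passthrough = [90, 91, 92, 93]

merged = masked_load(memory, 0, passthrough, mask)
two_step = masked_load_then_select(memory, 0, passthrough, mask)
print(merged, two_step)
```

Both functions return the same lanes, which is the equivalence the backend would rely on when the ISA lacks a merging form of the load.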