[LLVMdev] [Mesa3d-dev] Folding vector instructions
Chris Lattner
clattner at apple.com
Tue Dec 30 17:11:27 PST 2008
On Dec 30, 2008, at 3:03 PM, Zack Rusin wrote:
> On Tuesday 30 December 2008 15:30:35 Chris Lattner wrote:
>> On Dec 30, 2008, at 6:39 AM, Corbin Simpson wrote:
>>>> However, the special instructions, like "min", cannot be mapped
>>>> directly to LLVM IR. The conversion involves extracting the
>>>> vector elements, creating a less-than compare, creating a
>>>> 'select' instruction, and creating an 'insert-element'
>>>> instruction.
>>
>> Using scalar operations obviously works, but will probably produce
>> very inefficient code. One positive thing is that all target-specific
>> operations of supported vector ISAs (Altivec and SSE[1-4] currently)
>> are exposed either through LLVM IR ops or through target-specific
>> builtins/intrinsics. This means that you can get access to all the
>> crazy SSE instructions, but it means that your codegen would have to
>> handle this target-specific code generation.
>
> I think Alex was referring here to an AOS layout, which is not ready
> at all. The currently supported one is the SOA layout, which
> eliminates scalar operations.
Ok!
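
A minimal sketch of why the SOA layout avoids the per-element dance: if one register holds a single component (say, x) for four pixels, "min" is just a compare and a select across the whole register, with no extract or insert of individual components. The type soa_vec4 and the function soa_min are hypothetical names used only for illustration; they are not part of Gallium or LLVM.

```c
#include <stddef.h>

/* Hypothetical SOA register: one component for four pixels. */
typedef struct {
    float v[4];
} soa_vec4;

/* Element-wise min: one compare and one select per lane.
   Every lane is treated identically, so nothing has to be
   extracted from or inserted back into the register. */
static soa_vec4 soa_min(soa_vec4 a, soa_vec4 b)
{
    soa_vec4 r;
    for (size_t i = 0; i < 4; ++i)
        r.v[i] = (a.v[i] < b.v[i]) ? a.v[i] : b.v[i];
    return r;
}
```

With an AOS register (x, y, z, w of one pixel), an operation that only touches some components is what forces the extract/compare/select/insert sequence.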
>> Sure, it would be very reasonable to make these target-specific
>> builtins when targeting a GPU, the same way we have target-specific
>> builtins for SSE.
>
> Actually, the current plan is to have essentially a "two pass" LLVM IR.
> I wanted the first one to never lower any of the GPU instructions, so
> we'd have intrinsics or maybe even just function calls like gallium.lit,
> gallium.dot, gallium.noise and such. Then gallium should query the
> driver to figure out which instructions the GPU supports and run our
> custom LLVM lowering pass that decomposes those into things the GPU
> supports.
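
A toy sketch of the capability-driven lowering described in the quoted paragraph, assuming "gallium.dot" is a 3-component dot product. It is not LLVM pass code: the capability bits CAP_DOT and CAP_MAD, the function lower_call, and the plain string op names are all hypothetical stand-ins for whatever the driver query and the real IR would provide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bits a driver might report. */
enum { CAP_DOT = 1u << 0, CAP_MAD = 1u << 1 };

/* Lower one high-level call into ops the GPU supports.
   Writes at most `max` op names into `out` and returns the
   number of ops the full lowering needs. */
static size_t lower_call(const char *name, unsigned caps,
                         const char **out, size_t max)
{
    static const char *const with_mad[] = { "mul", "mad", "mad" };
    static const char *const plain[] = { "mul", "mul", "mul", "add", "add" };
    const char *const *seq;
    size_t n;

    if (strcmp(name, "gallium.dot") != 0 || (caps & CAP_DOT)) {
        /* Unknown call, or natively supported: keep it as is. */
        seq = &name;
        n = 1;
    } else if (caps & CAP_MAD) {
        /* x*x' first, then two fused multiply-adds. */
        seq = with_mad;
        n = 3;
    } else {
        /* Three multiplies followed by two adds. */
        seq = plain;
        n = 5;
    }

    for (size_t i = 0; i < n && i < max; ++i)
        out[i] = seq[i];
    return n;
}
```

The point of the design is that the backend only ever has to match either the single high-level call or a sequence it already declared support for.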
That makes a lot of sense. Note that there is no reason to use actual
LLVM intrinsics for this: naming them "gallium.lit" is just as good as
"llvm.gallium.lit" for example.
> Essentially I'd like to do as many of the complicated things in gallium
> as possible, to make the GPU LLVM backends in drivers as simple as
> possible. This would help us make the pattern matching in the
> generator /a lot/ easier (matching gallium.lit vs. the 9+ instructions
> it would be decomposed to) and give us a more generic, GPU-independent
> layer above. But that hasn't been done yet; I hope to be able to write
> that code while working on the OpenCL implementation for Gallium.
Makes sense. For the more complex functions (e.g. texture lookup) you
can also just compile C code to LLVM IR and use the LLVM inliner to
inline the code if you prefer.
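
As a rough illustration of the kind of helper that could be written in C and inlined, here is a nearest-neighbor texture lookup with clamp-to-edge addressing. The texture2d layout (single channel, row-major floats) and the names tex2d_nearest and clamp_int are assumptions made for this sketch, not an existing Gallium interface.

```c
#include <stddef.h>

/* Hypothetical single-channel texture; the layout is an assumption. */
typedef struct {
    const float *texels; /* row-major, width * height entries */
    int width;
    int height;
} texture2d;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Nearest-neighbor lookup; u and v are expected to be roughly in
   [0, 1], and out-of-range coordinates are clamped to the edge. */
float tex2d_nearest(const texture2d *t, float u, float v)
{
    int x = clamp_int((int)(u * (float)t->width), 0, t->width - 1);
    int y = clamp_int((int)(v * (float)t->height), 0, t->height - 1);
    return t->texels[(size_t)y * (size_t)t->width + (size_t)x];
}
```

A file like this can be turned into LLVM bitcode with a C front end (for example, clang -O2 -emit-llvm -c), linked into the shader module, and the call sites left to the inliner.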
-Chris