[LLVMdev] Does current LLVM target-independent code generator supports my strange chip?
Daniel M Gessel
gessel at apple.com
Sat Nov 22 09:37:12 PST 2008
On Nov 22, 2008, at 11:03 AM, Wei wrote:
> I have 24-bit integer operations as well as 24-bit floating point
> (s7.16) operations.
>
> The H/W supports load/store instructions; however, they suggest we
> not use these load/store instructions except for debugging purposes.
> That is to say, you can imagine we don't have load/store instructions,
> we don't have memory, we just have registers.
>
> I will run OpenGL Shading Language programs on this chip.
GLSL doesn't have pointers, so there are no "generic" loads and stores,
which simplifies things.
Unextended GLSL only requires support for integers in the 16 bit
range, and has no bitwise operations. It also doesn't specify integer
overflow behavior in any way.
The machines I worked with didn't support any integer ops, but GLSL
let us get by with "emulated" 16 bit integers (storing and operating
on them as floating point; divides required truncation after the op -
that sort of thing).
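To make the "truncation after the op" point concrete, here is a minimal sketch of an emulated integer divide on float-held values. The helper name emulatedIntDiv is hypothetical, and the sketch assumes both operands already hold whole numbers inside the range the float format represents exactly. Real hardware that divides via a reciprocal might need an extra fix-up step, which is not shown.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: both arguments are "integers" stored in float registers.
// Assumption: |a| and |b| are whole numbers the float format holds exactly.
inline float emulatedIntDiv(float a, float b) {
    assert(b != 0.0f && "divide by zero is left undefined in this sketch");
    // Integer division rounds toward zero, so drop the fractional part
    // of the floating-point quotient.
    return std::trunc(a / b);
}
```

For example, emulatedIntDiv(7.0f, 2.0f) yields 3.0f and emulatedIntDiv(-7.0f, 2.0f) yields -3.0f, matching round-toward-zero integer division.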
Since you have 24 bit integer operations, you're in better shape.
> About your comments, I (a new LLVM user) have some more questions:
>
> 1) You mention "custom handle the conversion of the integer/float
> constants that LLVM spits out". Does that mean
> I have to register a callback function that runs when LLVM
> wants to write a constant value to memory? But what about non-
> constant values?
What I mean is that you can probably get away with LLVM working with
float literals as f32, then converting them to your 24 bit format
during code gen. The specifics depend on how you want to handle
constants in your backend: literals in instructions or a constant pool
are the options I know of. For now, I'm using special "load literal"
instructions, but a constant pool may be more appropriate in the long
run. I'm still learning.
Integers too: let LLVM work with i32 internally, and convert literals
during code gen.
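As a rough illustration of converting literals during code gen, the sketch below re-encodes an f32 literal into a 24-bit pattern and range-checks an i32 literal. The bit layout is an assumption, not something stated in this thread: sign in bit 23, a 7-bit exponent with bias 63 in bits 22..16 (field value 0 reserved for zero), and 16 mantissa bits with an implicit leading 1. The rounding and overflow policy (truncate, flush underflow to zero, clamp overflow) and the names f32ToF24Bits and fitsInSigned24 are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed s7.16 layout: [23] sign, [22:16] exponent (bias 63), [15:0] mantissa.
// Assumed policy: truncate extra mantissa bits, flush underflow to signed zero,
// clamp overflow (including f32 inf/NaN) to the largest finite magnitude.
inline uint32_t f32ToF24Bits(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);          // read the IEEE-754 encoding
    uint32_t sign = bits >> 31;
    int32_t unbiased = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127;
    uint32_t mant = (bits >> 7) & 0xFFFFu;            // keep top 16 of 23 mantissa bits

    if (unbiased < -62) {                             // zero, denormals, tiny values
        return sign << 23;
    }
    if (unbiased > 64) {                              // too large, or inf/NaN
        unbiased = 64;
        mant = 0xFFFFu;
    }
    uint32_t biased = static_cast<uint32_t>(unbiased + 63);
    assert(biased >= 1 && biased <= 127);
    return (sign << 23) | (biased << 16) | mant;
}

// An i32 literal can be emitted as a 24-bit signed literal only if it fits.
inline bool fitsInSigned24(int32_t v) {
    return v >= -(1 << 23) && v < (1 << 23);
}
```

Under these assumptions 1.0f encodes as 0x3F0000 and -2.0f as 0xC00000. A backend would presumably report an error (or pick a different lowering) for an integer literal where fitsInSigned24 returns false.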
Since GLSL doesn't require load/store, and it sounds like your HW may
not be 100% reliable for these ops, you want to make sure your code stays
in registers.
I assume you'll be starting with the reference GLSL parser (from
3DLabs, IIRC - I don't even know if they still exist, actually) and
having it generate LLVM IR (has anybody done this before?). This will
give you much more control over the code - Clang is the front end for
the project I'm working on, and it generates stack based code; most of
the stack operations get optimized out by inlining and the mem2reg
pass, but not everything.
> ex:
> int a;
> and LLVM wants to put a into memory.
>
> and I don't really know what the "i32/f32 sounds a good place to
> start" means...
I mean that having your registers declared as i32 + f32 will probably
work out well, especially since you don't have pointers in your
language.
The issue would be that LLVM would want to store register values as 32
bits - and do all the pointer math that way. Depending on how your HW
works, this may or may not be okay. Even then, you might be able to
patch it up if you really needed to store your registers 3 byte aligned.
Fortunately, this is not an issue with GLSL.
> 2) I don't know why you mention "I'd assume you'd have intrinsics for
> I/O."
For GLSL, you have to have some way of reading attributes and
uniforms, exporting to/reading from varyings, etc.
Different GPUs do things differently of course: in some cases, it's a
matter of certain GPRs being initialized by "fixed function" HW with
input values at the start of the shader and certain GPRs being left
with output values at the end of the shader. Other GPUs require
explicit "export" instructions, perhaps just reads/writes to dedicated
I/O registers. Some have a mix (this is the case for HW I've worked
with).
If you have export instructions, or even special I/O registers, I was
thinking that they could be represented or accessed by target-specific
ops (intrinsics). You'd have the GLSL front end generate these
intrinsic operations.
I haven't had to work with register constraints in LLVM, so I'm not
sure what would be the best approach if I/O is done through specific GPRs:
you don't want to reserve those registers for I/O only.... it would
take some exploration.
>
> 3) I don't think I get you about the following statements:
>> If you want to support memory operations, your integers need to
>> support the addressing range correctly - you effectively have 17 bits
>> of mantissa - so it may be a tight squeeze without 24 bit integer ops
>> (shifts and ands and stuff will also be a painful, but that's a more
>> expansive topic).
> Can you give some example?
Sorry, I was "thinking out loud".
I made the assumption here that you didn't have 24 bit integer ops,
and that you might try to represent pointers as integers in a single
24 bit float value (maybe with a 1D texture as your addressable
memory). In that case, you'd have a very limited range.
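To put a number on "very limited": with 16 stored mantissa bits plus an implicit leading 1 (the 17 bits mentioned earlier), every integer up to 2^17 = 131072 is exact, and 131073 is the first one that is not. The sketch below checks this; the helper name exactlyRepresentable is hypothetical and it ignores exponent range.

```cpp
#include <cassert>
#include <cstdint>

// True if n can be stored exactly in a float with `mantissaBits` explicit
// mantissa bits plus an implicit leading 1 (exponent range is ignored).
inline bool exactlyRepresentable(uint32_t n, unsigned mantissaBits) {
    assert(mantissaBits < 31);
    if (n == 0) return true;
    // Trailing zero bits are absorbed by the exponent, so strip them.
    while ((n & 1u) == 0) n >>= 1;
    // What remains must fit in mantissaBits + 1 significant bits.
    return n < (1u << (mantissaBits + 1));
}
```

So exactlyRepresentable(131072, 16) is true while exactlyRepresentable(131073, 16) is false: a float-encoded address could only step through about 128K distinct locations before skipping values.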
But GLSL doesn't have pointers, so this isn't an issue (and 24 bit
integers give you a decent addressing range for debugging).
Dan
>
> Really really thanks about your comments.
>
> Wei.
>
> On Nov 20, 10:24 pm, Daniel M Gessel <ges... at apple.com> wrote:
>> This is similar to ATI's R300/R420 pixel shaders. I'm familiar with
>> this hardware, but not really an LLVM expert (working on a code
>> generator myself, but learning as I go).
>>
>> Do you have 24-bit integer operations, or just floating point?
>>
>> What about load/store?
>>
>> Are you looking to run large C programs with complex data structures,
>> or just comparatively simple math functions (i.e. a compute
>> "kernel")?
>>
>> If you only want to support programs that can live entirely within
>> registers, you can custom handle the conversion of the integer/float
>> constants that LLVM spits out and i32/f32 sounds a good place to
>> start
>> - LLVM's mem2reg and inlining is very effective at getting rid the
>> majority of stack operations, and I'd assume you'd have intrinsics
>> for
>> I/O.
>>
>> If you want to support memory operations, your integers need to
>> support the addressing range correctly - you effectively have 17 bits
>> of mantissa - so it may be a tight squeeze without 24 bit integer ops
>> (shifts and ands and stuff will also be a painful, but that's a more
>> expansive topic).
>>
>> Dan
>>
>> On Nov 20, 2008, at 7:46 AM, Wei wrote:
>>
>>
>>
>>> Because each channel contains 24-bit, so.. what is the
>>> llvm::SimpleValueType I should use for each channel?
>>> the current llvm::SimpleValueType contains i1, i8, i16, i32, i64,
>>> f32,
>>> f64, f80, none of them are fit one channel (24-bit).
>>
>>> I think I can use i32 or f32 to represent each 24-bit channel, if
>>> the
>>> runtime result of some machine instructions exceeds 23-bit (1 bit is
>>> for sign), then it is an overflow.
>>> Is it correct to claim that the programmers needs to revise his
>>> program to fix this problem?
>>> Am I right or wrong about this thought?
>>
>>> If there is a chip, whose registers are 24-bit long, and you have to
>>> compile C/C++ programs on it.
>>> How would you represent the following statement?
>>
>>> int a = 3;
>>> (Programmers think sizeof(int) = 4)
>>
>>> Wei.
>>
>>> On Nov 19, 2:01 am, Evan Cheng <evan.ch... at apple.com> wrote:
>>>> Why not model each channel as a separate physical register?
>>
>>>> Evan
>>
>>>> On Nov 17, 2008, at 6:36 AM, Wei wrote:
>>
>>>>> I have a very strange and complicate H/W platform.
>>>>> It has many registers in one format.
>>>>> The register format is:
>>
>>>>> -------------------------------------
>>>>> | 24-bit | 24-bit | 24-bit | 24-bit |
>>>>> -------------------------------------
>>>>>     a        b        c        d
>>
>>>>> There are 4 channels in a register, and each channel contains 24-
>>>>> bit, hence, there are total 96-bit in 'one' register.
>>>>> You can store a 24-bit integer or a s7.16 floating-point data into
>>>>> each channel.
>>>>> You can name each channel 'a', 'b', 'c', 'd'.
>>
>>>>> Here is an example of the operation in this H/W platform:
>>
>>>>> ADD R3.ab, R1.abab, R2.bbaa
>>
>>>>> it means
>>
>>>>> Add 'abab' channel of R1 and 'bbaa' channel of R2, and
>>>>> put the result into the 'ab' channel of R3.
>>
>>>>> It's complicate.
>>>>> Imagine a non-existed temp register named 'Rt1', the content of
>>>>> its
>>>>> 'a','b','c','d' channel are got from 'a','b','a','b' channel of
>>>>> R1,
>>>>> and imagine another non-existed temp register named 'Rt2', the
>>>>> content of its 'a','b','c','d' channel are got from
>>>>> 'b','b','a','a'
>>>>> channel of R2.
>>>>> and then add Rt1 & Rt2, put the result to R3
>>>>> this means
>>>>> the 'a' channel of R3 will be equal to the 'a' channel of Rt1 plus
>>>>> the 'a' channel of Rt2, (i.e. 'a' from R1 + 'b' from R2, because
>>>>> R1.'a'bab and R2.'b'baa)
>>>>> the 'b' channel of R3 will be equal to the 'b' channel of Rt1 plus
>>>>> the 'b' channel of Rt2, (i.e. 'b' from R1 + 'b' from R2, because
>>>>> R1.a'b'ab and R2.b'b'aa)
>>>>> the 'c' channel of R3 will be untouched, the value of the 'c'
>>>>> channel of Rt1 plus the 'c' channel of Rt2 (i.e. 'a' from R1 + 'a'
>>>>> from R2, because R1.ab'a'b and R2.bb'a'a) will be lost.
>>>>> the 'd' channel of R3 will be untouched, too. The value of the 'd'
>>>>> channel of Rt1 plus the 'd' channel of Rt2 (i.e. 'b' from R1 + 'a'
>>>>> from R2, because R1.aba'b' and R2.bba'a') will be lost, too.
>>
>>>>> I don't know whether I can set the 'type' of such register using a
>>>>> llvm::MVT::SimpleValueType?
>>>>> According the LLVM doc & LLVM source codes, I think
>>>>> llvm::MVT::v8i8,
>>>>> v2f32, etc is used to represent register for SIMD instructions.
>>>>> I don't think the operations in my platform are SIMD instructions.
>>>>> However, I can not find any llvm::MVT::SimpleValueType which can
>>>>> represents a 96-bit register.
>>
>>>>> Thus, my question is:
>>
>>>>> 1) Does current LLVM backend supports this H/W?
>>>>> 2) If yes, how can I write the type of the register class in
>>>>> my .td
>>>>> file?
>>
>>>>> (Which value should I fill in the following 'XXX' ?)
>>>>> def TempRegs : RegisterClass<"MFLEXG", [XXX], 32,
>>>>>     [R0,  R1,  R2,  R3,  R4,  R5,  R6,  R7,
>>>>>      R8,  R9,  R10, R11, R12, R13, R14, R15,
>>>>>      R16, R17, R18, R19, R20, R21, R22, R23,
>>>>>      R24, R25, R26, R27, R28, R29, R30, R31]> {
>>>>> }
>>
>>>>> 3) If not, does this means I have to write the whole LLVM backend
>>>>> based on the basic llvm::TargetMachine & llvm::TargetData, just
>>>>> like
>>>>> what CBackend does?
>>
>>>>> --------------------------------------------------------
>>>>> Wei Hu
>>>>> http://www.csie.ntu.edu.tw/~r88052/
>>>>> http://wei-hu-tw.blogspot.com/
>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVM... at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev