[cfe-dev] SPIR: Answers to the issues raised so far

Ouriel, Boaz boaz.ouriel at intel.com
Thu Sep 27 09:46:59 PDT 2012


Hi All,



Here are answers to the questions / comments that were raised so far during the SPIR discussions.



**** A general clarification regarding sizeof ****
In SPIR, sizeof in an integer constant expression is illegal.
The reason behind it is that once the device is no longer known, the width of the unsigned integer that represents the size_t type is no longer known.
The C spec states in section 6.5.3.4.4 that "The value of the result is implementation-defined, and its type(an unsigned integer type) is size_t, defined in <stddef.h> (and other headers)."
SPIR defines size_t as a unsized unsigned integer and as such cannot be known until the SPIR binary is lowered to the device.
At that time, SPIR is known and as such can be lowered correctly to its type.



*** Hal Finkel: Handling FP_CONTRACT ***



     *****Comment: The current specification provides a mechanism for handling FP_CONTRACT, but does so only at the module level. After much debate,

                                         we have adopted and implemented a way of handling FP_CONTRACT in clang/LLVM through the llvm.fmuladd intrinsic (currently in trunk).

                                         I suggest that the proposed spir.disable.FP CONTRACT metadata be replaced with the current llvm.fmuladd-based mechanism.



                    Answer: We were not aware of this, we will definitely adopt this approach.



*** James Molloy: Why new calling conventions?***



   *** Comment: What are their semantics? And what is their purpose? Why not use metadata instead?

            Answer: The semantics of the new calling conventions are as follows:

           spirfunc calling convention -

                  All arguments are passed as is to the function with the exception of structures, no lowering or expansion is allowed.

                  Structures are passed as a pointer to struct with the byval attribute set.

                  Functions marked with spirfunc calling convention can only be called by another function marked with either spirfunc or spirkrnl calling conventions.

                  Functions marked with spirfunc calling convention can only call functions marked with spirfunc or spirkrnl calling conventions.

                  Functions marked with spirfunc calling convention can only have zero or 1 return values.

                  Functions marked with spirfunc calling convention are only visible to the device.

                  Variable arguments are not allowed, except for printf.

                  This calling convention does not specify the how it is lowered to registers or how the return value is specified.



            spirkrnl calling convention -

                 Inherits the definition of spirfunc calling convention.

                Functions marked with spirkrnl calling convention cannot have any return values.

                Functions marked with spirkrnl calling convention cannot have variable arguments.

                Functions marked with spirkrnl calling convention can be called by the host.

                Functions marked with spirkrnl calling convention are externally visible.





*** James Molloy: Why disallow type conversion for vector types?***



     ***Comment: This might cause some LLVM optimizations to generate an invalid SPIR module which



              Answer: Type conversions in OpenCL between vector types is done via builtin functions and not via implicit conversions, so there is no OpenCL

                                code that can generate these conversions directly(OpenCL spec 6.2.1). In order to be portable, library functions cannot be lowered to their IR equivalent until

                               after the device is known. When OpenCL C was developed, there were many different implementations that supported vector casts, but they were not conversion casts(e.g. Altivec/Cell C extensions).

                               OpenCL requires explicit vector conversion functions, and as such SPIR inherits those rules.

                               Since many compilers already had the vector casts that weren't conversions, the function calls were required so that code could not be written expecting the prior behavior of other compilers in OpenCL.

                               The implication is that a SPIR optimizer will need to rule out such optimizations.



*** Richard Smith, Eli Friedman & Nadav Rotem: Portability Issues ***



         *****comment 1: int does_this_compile[sizeof(void*) - 3];

                      Answer: In this scenario the sizeof(void*) is no longer a frontend compile time constant. This case is not supported, and a compile time error should be raised by a SPIR frontend



        ****comment 2: struct how_do_you_represent_this_in_IR {

                                  int a : 1;

                                  int b : sizeof(void*) * 4;

                        };

                         Answer: Bitfields are disallowed in OpenCL "C"



              ****comment 3:  How do you perform record layout if the size of a pointer is unknown? For instance:



                struct A {

                    int *p;

                    int n;

                } a;

               int arr[offsetof(A, n) - 3]; // or, int arr[(char*)&a.n - (char*)&a.p - 3];



                       Answer:  Since in the current implementation of SPIR, a pointer is defined as 64bits when in a structure(SPIR spec 2.1.5), the offsets themselves are well defined.



        *****comment 4:

                  // We're required to diagnose this iff sizeof(size_t) != 4.

                 extern int x[20];

                 int x[sizeof(size_t) * 5;

               // We're required to diagnose this iff sizeof(size_t) == 4.

                void f(int x) {

                       switch(x) {

                          case 4:

                          case sizeof(size_t):

                                break;

                    }

               }

              Answer:  In this case the expression is no longer a front-end constant expression. This will be a SPIR frontend compile time error.



       *****comment 5: What about this case?

                   enum E {

                         a = sizeof(void*) // is this valid?

                   };

                   Answer: sizeof(void*) is not a frontend compile time constant, so you get a SPIR frontend compile error.



       ****comment 6: What is  the rank of 'size_t'?

                  example: is "sizeof(int) + -8LL < 0" true or false?

                   Answer: The rank of size_t is either uint for 32bit devices and ulong for 64bit devices (int < uint ( == size_t for 32bit) < long  ( == size_t for 64bit) < ulong)

                                     This means that the only ambiguity is when comparing it to long. Whenever this ambiguity is encountered a frontend compilation error should occur.



       ****comment 7: Why can't we always make size_t 64 bits wide?

                                          If we ignore the issue of size_t inside structs,  I don't see the problem with deciding that size_t is 64bits, even on 32bit systems.

                                           The only place that I saw that size_t was used, in user code, is in the get_global_id() family of functions (and other APIs which require offsets).

                                           A target-specific compiler optimization can reduce the bit width of the get_global_id (and friends) back to 32bits and propagate this, if needed.

                  Answer:  First and most importantly, OpenCL embedded profile doesn't require support for 64 bit integers (it is optional).

                                     Making all size_t and ptrdiff_t 64 bit would disallow the usage of SPIR in some embedded systems.

                                     Secondly, I'm not sure target specific optimizations will be able to guess the bit width correctly.

                                     Provided a kernel which uses a buffer and accesses elements from get_global_id(0)*8 to get_global_id(0)*8+7,

                                     it won't be sufficient to assume that get_global_id(0) returns a 32bit value, since this makes get_global_id(0)*8 is a 35bit value, and it would disable these optimizations.

                                     Assuming get_global_id(0) returns less than 32 bits seems to me wrong too. Not being able to optimize these size_t values in 32 bit architectures would cause huge performance degradation,

                                    since emulation of 64 bit operations in 32 bit architectures would be quite painful



I appreciate all of the good feedback,
Boaz
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120927/f3e574e7/attachment.html>


More information about the cfe-dev mailing list