[cfe-dev] Completing the OpenCL Vector Extensions

Mon Jan 14 11:38:33 PST 2013

Hi Michael,
Why not implement something like __builtin_astype, i.e. a builtin that takes a type parameter?

It could be something like:

double4 convert_double4(float4 v) {
  return __builtin_vector_convert(v, double4);
}

or maybe even use a preprocessor macro:

#define convert_double4(V) __builtin_vector_convert((V), double4)
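
To make the intended semantics concrete, here is a minimal sketch that emulates such a type-parameter builtin with a macro. VECTOR_CONVERT is a hypothetical stand-in for the proposed __builtin_vector_convert, not an existing builtin; it relies on the GCC/Clang vector_size attribute and GNU statement expressions, and it assumes both vector types have the same number of lanes.

```c
/* Vector typedefs via the GCC/Clang vector_size attribute (sizes in bytes). */
typedef float  float4  __attribute__((vector_size(16)));
typedef double double4 __attribute__((vector_size(32)));

/* Hypothetical stand-in for the proposed builtin: converts V lane by lane
   into a vector of type T. Assumes T has as many lanes as V. */
#define VECTOR_CONVERT(V, T)                                          \
  ({                                                                  \
    __typeof__(V) src_ = (V);                                         \
    T dst_ = {0};                                                     \
    for (unsigned i_ = 0; i_ < sizeof(src_) / sizeof(src_[0]); ++i_)  \
      dst_[i_] = src_[i_]; /* ordinary scalar conversion per lane */  \
    dst_;                                                             \
  })

/* The preprocessor form suggested above, built on the stand-in. */
#define convert_double4(V) VECTOR_CONVERT((V), double4)
```

With this in place, convert_double4(v) on a float4 v yields a double4 whose lanes hold the widened values. A real builtin would emit a single IR conversion instead of a loop.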

Thanks
    Guy

-----Original Message-----
From: cfe-dev-bounces at cs.uiuc.edu [mailto:cfe-dev-bounces at cs.uiuc.edu] On Behalf Of Michael Gottesman
Sent: Saturday, January 12, 2013 07:15
To: Clang Dev
Subject: [cfe-dev] Completing the OpenCL Vector Extensions

This email is just a request for ideas/thoughts (i.e. lower than a request for comment).

Something that has been on my interest list for a while is completing the Clang OpenCL Vector Extensions by creating target-independent vector conversions for the "usual" types (i.e. not 8-bit float vectors, but rather float -> double, uint32x2 -> uint64x2, etc.).
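
As a reference point for what these conversions are expected to compute, here is a lane-by-lane sketch of the two examples just mentioned. The typedefs and function names are made up for illustration and use the GCC/Clang vector_size attribute; this only shows the semantics, not how a target-independent implementation would be emitted.

```c
#include <stdint.h>

/* Illustrative vector typedefs (sizes in bytes); names are not from any header. */
typedef float    float32x2 __attribute__((vector_size(8)));
typedef double   float64x2 __attribute__((vector_size(16)));
typedef uint32_t uint32x2  __attribute__((vector_size(8)));
typedef uint64_t uint64x2  __attribute__((vector_size(16)));

/* float -> double: every lane is widened, which is exact. */
static inline float64x2 convert_f32x2_to_f64x2(float32x2 v) {
  float64x2 r = {0.0, 0.0};
  for (int i = 0; i < 2; ++i)
    r[i] = (double)v[i];
  return r;
}

/* uint32x2 -> uint64x2: every lane is zero-extended. */
static inline uint64x2 convert_u32x2_to_u64x2(uint32x2 v) {
  uint64x2 r = {0, 0};
  for (int i = 0; i < 2; ++i)
    r[i] = (uint64_t)v[i];
  return r;
}
```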

A few ideas have come up in discussions with various people.

1. Put the vector conversions as .ll files into compiler-rt in some manner. This option will suck in terms of performance, though, since every conversion becomes a call into the runtime. We could use LTO to perform inlining, but compiler-rt is a dylib in many instances, so we would really need a separate compiler-rt for .ll code. Then we would get nice performance and would not need to extend the compiler itself.

2. Create a builtin like __builtin_shufflevector, but for conversions, wrapped in a header. This idea has two different implementations: 2a. the type approach and 2b. the flag approach.

2a. The type approach. We already know the types of the two vectors, so why not just assume that the number of lanes is compatible and that the conversion is valid for the underlying scalar types, and output the conversion in IR? We could also potentially put in a check for whether the current platform supports the given conversion, or just say: use the header, you are not supposed to use the builtin directly. The header declarations would look like this (I just made up the function names):

float64x4 ConvertFloatToDouble(float32x4 v) {
	float64x4 result;
	__builtin_vector_convert(v, &result);
	return result;
}
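
A rough emulation of the type approach, to show what "we already know the types" buys us: the macro below deduces everything from the source value and the destination pointer, and rejects mismatched lane counts at compile time. VECTOR_CONVERT_INTO and LANES are hypothetical names used only for this sketch; a real builtin would do the same checks in Sema and emit one IR conversion rather than a loop.

```c
typedef float  float32x4 __attribute__((vector_size(16)));
typedef double float64x4 __attribute__((vector_size(32)));

/* Number of lanes in a vector value. */
#define LANES(X) (sizeof(X) / sizeof((X)[0]))

/* Hypothetical stand-in for __builtin_vector_convert(v, &result):
   both types are taken from the arguments, nothing is spelled out. */
#define VECTOR_CONVERT_INTO(V, OUT)                                     \
  do {                                                                  \
    __typeof__(V) src_ = (V);                                           \
    __typeof__(OUT) out_ = (OUT);                                       \
    _Static_assert(LANES(src_) == LANES(*out_), "lane counts differ");  \
    for (unsigned i_ = 0; i_ < LANES(src_); ++i_)                       \
      (*out_)[i_] = src_[i_]; /* scalar conversion per lane */          \
  } while (0)
```

The body of ConvertFloatToDouble above would then be the declaration of result, VECTOR_CONVERT_INTO(v, &result), and the return.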

2b. The flag approach. The flag approach is to follow along the lines of __builtin_shufflevector and just encode the various conversion operations in an integer, using the first x bits to encode the operation, the second x bits to encode the type of the input, and the third x bits to encode the type of the output.

float64x4 ConvertFloatToDouble(float32x4 v) {
	return __builtin_vector_convert(v, CONSTANT);
}
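
One possible layout for such a constant, purely as an illustration: the field width (8 bits), the enumerator names and their values are all assumptions made up for this sketch, and nothing here is an existing Clang encoding.

```c
#include <stdint.h>

/* Assumed field width: "x" bits per field, here 8. */
enum { FIELD_BITS = 8, FIELD_MASK = (1u << FIELD_BITS) - 1 };

/* Made-up operation and element-type codes. */
enum conv_op   { OP_FP_EXTEND = 1, OP_ZERO_EXTEND = 2 };
enum elem_type { TY_F32 = 1, TY_F64 = 2, TY_U32 = 3, TY_U64 = 4 };

/* Pack operation, input type and output type into one integer:
   bits [0, x) = operation, [x, 2x) = input type, [2x, 3x) = output type. */
#define CONV_FLAG(OP, IN, OUT)                \
  ((uint32_t)(OP) |                           \
   ((uint32_t)(IN) << FIELD_BITS) |           \
   ((uint32_t)(OUT) << (2 * FIELD_BITS)))

/* Decoders, as the compiler side would need them. */
static inline unsigned conv_flag_op(uint32_t f)  { return f & FIELD_MASK; }
static inline unsigned conv_flag_in(uint32_t f)  { return (f >> FIELD_BITS) & FIELD_MASK; }
static inline unsigned conv_flag_out(uint32_t f) { return (f >> (2 * FIELD_BITS)) & FIELD_MASK; }
```

Under these assumptions, the CONSTANT in ConvertFloatToDouble would be CONV_FLAG(OP_FP_EXTEND, TY_F32, TY_F64). Note that the input and output fields duplicate information the compiler already has from the argument and return types, which is the main difference from 2a.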

Any feedback/comments/pitchforks are welcome = ).

Michael