[PATCH] D15914: [OpenCL] Pipe builtin functions

Mon Jan 11 18:42:02 PST 2016

pxli168 added inline comments.

================
Comment at: include/clang/Basic/Builtins.def:1260
@@ +1259,3 @@
+
+LANGBUILTIN(reserve_read_pipe, "i.", "tn", OCLC_LANG)
+LANGBUILTIN(reserve_write_pipe, "i.", "tn", OCLC_LANG)
----------------
Anastasia wrote:
> Ok, this way it means variable number of arg which isn't right either. But I see some other builtins do the same.
> 
> I think a better approach would be to add pipe as a modifier and a way to represent any gentype in builtins declarations here to be able to specify a signature properly. Although it seems it won't be used for anything but documentation purposes.
> 
> Also adding the overloaded variant for read/write_pipes explicitly might be a good idea (similar to sync_fetch_and_add).
> 
> But I see that not all builtins follow this approach, so it's up to you.
I think this maybe OK for a def file, the parameter is strictly checked in Sema. 

================
Comment at: lib/CodeGen/CGBuiltin.cpp:2033
@@ +2032,3 @@
+          *Arg1 = EmitScalarExpr(E->getArg(1));
+    llvm::Type *ReservedIDTy = ConvertType(getContext().OCLReserveIDTy);
+
----------------
pekka.jaaskelainen wrote:
> Anastasia wrote:
> > pekka.jaaskelainen wrote:
> > > Anastasia wrote:
> > > > Why do we need to modify user defined functions? Only builtins will have this extra parameter.
> > > > 
> > > > I think packet size would be useful for builtin implementation to know the number of bytes to be read/written. I don't see how to implement it correctly otherwise. As mentioned earlier relying on the metadata is not a conventional compilation approach and should be avoided if possible.
> > > The pipe struct can have the packet size in its header before the actual ring buffer or whatever, which can be used as a fallback unless the compiler can optimize it to a larger access. Correct implementation thus doesn't require a "hidden parameter". Adding it as a compile time hidden constant argument should help the optimizers, that's of course true, but I don't think it's strictly required.
> > > 
> > > If you think having a special behavior for the built-ins calls isn't problematic, then fine, I'm not having so strong opinion on this.
> > So where would this size be initialized/set? Note that host code might have different type sizes.
> > 
> > I am thinking that having a size parameter makes the generated function more generic since we lose information about the type here and recovering it might involve extra steps. I am currently not clear about how to do that.
> https://www.khronos.org/registry/cl/sdk/2.0/docs/man/xhtml/clCreatePipe.html sets the packet size in creation. We have a prototype implementation working using that. But that generic software version uses a byte loop which is painfully slow, which of course should be optimized to wide loads/stores whenever possible.
> 
> If we want to optimize it, it means inlining the builtin call and then deriving the size for the loop at the compile time after which it can be vectorized (assuming the packet fifo is aligned). Where this size constant comes from is the question under discussion -- you propose having it as an additional hidden parameter to the calls and I propose a metadata. Yours optimizes automatically but requires special casing (when calling a builtin, not user function, add this magic parameter), mine requires a new optimization that takes the MD in account and converts it to a constant after inlining the built-in.
> 
> I'm open for both as I do not really know how much trouble the special casing will bring and I appreciate that optimizations might just work, but in my understanding the size info is strictly not required, but very useful for optimization.
> I am thinking that having a size parameter makes the generated function more generic since we lose information about the type here and recovering it might involve extra steps. I am currently not clear about how to do that.

I know what Anastasia is worried about, but actually I should say that you can simply create a pass for the pipe function that get this information for the metadata, you can use MDNode to get these information pretty easy. That is what I am doing now.

================
Comment at: lib/Sema/SemaChecking.cpp:291
@@ +290,3 @@
+  bool isValid = true;
+  // TODO: For all pipe built-in read is for read_only?
+  bool ReadOnly = getFunctionName(Call).find("read") != StringRef::npos;
----------------
Anastasia wrote:
> I agree the spec doesn't require that checking but I think it's just being imprecise in the description. If you are in doubt you can raise a bug with Khronos to clarify. This might result in additional latency.
> 
> I think it makes sense though to check all of them.
I will keep it here then, and we can change it once the Khronos have some feeback.

================
Comment at: test/SemaOpenCL/invalid-pipe-builtin-cl2.0.cl:4
@@ +3,3 @@
+void test(read_only pipe int p, global int* ptr){
+  int tmp;
+  read_pipe(tmp, p);    // expected-error {{first argument to read_pipe must be a pipe type}}
----------------
Anastasia wrote:
> I think we need to test all the builtins!
Should we test the negative case for all of the builtin? There are for group of cases and I think maybe just 4 groups with read/write that touches all of the cases maybe enough.

http://reviews.llvm.org/D15914