[PATCH] D97869: [OpenCL][Draft] Add OpenCL builtin test generator

Sven van Haastregt via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 10 08:35:08 PST 2021


svenvh added a comment.

In D97869#2616772 <https://reviews.llvm.org/D97869#2616772>, @Anastasia wrote:

> Regarding 2 and 4, I think we should drive towards deprecation of `opencl-c.h` as it is a maintenance overhead, but we could convert it into a test instead?

Perhaps you misunderstood option 4, as that does not require maintaining `opencl-c.h`.  Only for option 2 would we have to maintain `opencl-c.h`.  For option 4, the checked-in files could look something like this:

name=clang/test/SemaOpenCL/builtins-opencl2.0.check
  ; RUN: clang-tblgen -emit-opencl-builtin-count -cl-std=cl2.0 OpenCLBuiltins.td | FileCheck '%s'
  
  ; This file lists the number of builtin function declarations per builtin function name
  ; available in the OpenCL 2.0 language mode.
  
  CHECK: absdiff 48
  CHECK: abs 48
  CHECK: acos 18
  CHECK: acosh 18
  CHECK: acospi 18
  ...

To illustrate the use, suppose a developer writes a patch that accidentally makes the `double` variants of `acos` unavailable in OpenCL 2.0.  The reported number of `acos` builtins would then differ, so FileCheck would fail on the line containing `acos 18`.  If we use option 4 in combination with option 1, a Clang developer can regenerate the test and observe the differences for the `acos` tests; from there it becomes obvious that the `double` variants of `acos` disappeared.
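
To make that concrete: assuming that 6 of the 18 `acos` declarations are the `double` variants (a purely hypothetical split), regenerating the check file after such a regression would produce a diff along these lines:

  -CHECK: acos 18
  +CHECK: acos 12

A one-line change like this immediately narrows the regression down to a single builtin name.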

> Aside from option 3, all other options will require adding non-standard testing formats to Clang. This means that if there are any core refactoring changes, even outside of OpenCL, affecting the functionality in the headers, one would be required to learn how we do the testing in order to address any breakages. Perhaps this can be mitigated by documentation, but overall it is an extra burden on the community. Speaking from my personal experience, I have been in situations where I broke functionality outside of OpenCL, and having a unified testing format is a big advantage in understanding and addressing issues in unfamiliar code.

Do you expect this to be a problem in practice, and if so, could you elaborate in more detail on your particular concerns for the other proposals?

> What problems do you see with checking in autogenerated files? There are some existing tests that check similar functionality and they seem to contain repetitive patterns. Honestly, I wouldn't be able to say whether they were written with copy/paste or auto-generated, but I am not sure that is very important. Of course, it would have been better if we had built up the functionality incrementally, and that is still an option. But it has not been possible in the past, and as a matter of fact the original header was committed as one large file (around 15K lines).
>
>>   Updates to OpenCLBuiltins.td will give large diffs when regenerating the tests (regardless of whether the file is split or not).
>
> Could you provide more details. I don't follow this.

There are two problems here that compound each other: checking in autogenerated files, and the fact that those files are large.  The problem I anticipate with regenerating the test files (at least with the current generator) is that inserting a new builtin into the .td file renumbers the test functions, which results in a large diff.
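
As a hypothetical illustration (the real generator's naming scheme may differ), assume the generated test functions are numbered sequentially:

  // Before inserting a new builtin into the .td file:
  void test_1234(void) { (void)acos(1.0f); }

  // After the insertion, the same test reappears under a shifted number,
  // and so does every test that follows it:
  void test_1235(void) { (void)acos(1.0f); }

Even though nothing about the `acos` tests changed semantically, every renumbered function shows up in the diff.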

Other problems I see with checking in large autogenerated files:

- Merge conflicts in the generated files when reordering/rebasing/porting patches.  A single conflict in the .td description could expand to many conflicts in the autogenerated file(s).
- Diffs in the autogenerated files clutter the output of, for example, `git log -p`.
- Code reviews of changes to those files are harder due to the size.
- It bloats the repository.

Disclaimer: I may be biased by my previous experience on projects that had large autogenerated files checked in; that is why I feel we should avoid doing so if we reasonably can.  I have probably revealed now that option 3 is my least favorite option. :-)

>>   Manually verifying that the generated test is actually correct is not trivial.
>
> I think the way to approach this would be to just assume that the test is correct and then fix any issues we discover as we go along with the development. There is nothing wrong with this, though, as even small manually written tests have bugs.

Fair enough.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97869/new/

https://reviews.llvm.org/D97869


