[PATCH] D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets

Thu Aug 10 03:39:04 PDT 2017

arphaman added a comment.

1. I'm sorry, but I had to revert r310489 and the follow-up commits r310505, r310519, r310537 and r310549, since the failures appear to be accumulating. The revert commit is r310580. The following RUN lines were failing for me with various assertion failures and FileCheck errors:

  /// ###########################################################################

  /// Check cubin file generation and usage by nvlink
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-CUBIN %s

  // CHK-CUBIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
  // CHK-CUBIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
  // CHK-CUBIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"

  /// ###########################################################################

  /// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
  // RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-CUBIN-DARWIN %s

  // CHK-CUBIN-DARWIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
  // CHK-CUBIN-DARWIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
  // CHK-CUBIN-DARWIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"

  /// ###########################################################################

  /// Check cubin file generation and usage by nvlink
  // RUN:   touch %t1.o
  // RUN:   touch %t2.o
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

  // CHK-TWOCUBIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"

  /// ###########################################################################

  /// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
  // RUN:   touch %t1.o
  // RUN:   touch %t2.o
  // RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN-DARWIN %s

  // CHK-TWOCUBIN-DARWIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"

  /// ###########################################################################

  /// Check PTXAS is passed -c flag when offloading to an NVIDIA device using OpenMP.
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-PTXAS-DEFAULT %s

  // CHK-PTXAS-DEFAULT: ptxas{{.*}}" "-c"

  /// ###########################################################################

  /// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP - disable it.
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fnoopenmp-relocatable-target %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-PTXAS-NORELO %s

  // CHK-PTXAS-NORELO-NOT: ptxas{{.*}}" "-c"

  /// ###########################################################################

  /// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP
  /// Check that the flag is passed when -fopenmp-relocatable-target is used.
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-relocatable-target %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-PTXAS-RELO %s

  // CHK-PTXAS-RELO: ptxas{{.*}}" "-c"

  /// ###########################################################################

  /// Check PTXAS is passed the compute capability passed to the driver.
  // RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-PTXAS-VERSION %s

  // CHK-PTXAS-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"

  /// ###########################################################################

  /// Check PTXAS is passed the compute capability passed to the driver.
  // RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
  // RUN:   | FileCheck -check-prefix=CHK-PTXAS-DARWIN-VERSION %s

  // CHK-PTXAS-DARWIN-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"

2. I think this test is getting a bit big, which makes it harder to figure out exactly what is failing. Can you please use new test files in future patches?

3. Can you please figure out why your commit emails don't make it to the cfe-commits mailing list? The situation is easier to follow with the commit emails.

Let me know if you need help figuring out the failures,
Alex

Repository:
  rL LLVM

https://reviews.llvm.org/D29660