[PATCH] D124158: [Clang][Attr] Skip adding noundef attribute to arguments when function has convergent attribute

krishna chaitanya sankisa via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 21 02:54:35 PDT 2022


skc7 created this revision.
skc7 added reviewers: sameerds, cdevadas, ronlieb.
Herald added subscribers: mattd, asavonic, ThomasRaoux, jdoerfert, kerbowa, kbarton, jvesely, nemanjai.
Herald added a project: All.
skc7 requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: llvm-commits, cfe-commits, sstefan1.
Herald added projects: clang, LLVM.

Change https://reviews.llvm.org/D105169 enables noundef attribute by default. This is causing issue with functions tagged with convergent attribute.

For Ex: SimplifyCFG pass removes the branch leading to a BB which has an incoming value that will always trigger undefined behavior. This basically modifies the CFG and combines the basic blocks. This works for CPU execution. But on a GPU, there are intrinsics like "__shfl_sync(unsigned mask, T var, int srcLane, int width=warpSize)", Where the exchange of variable occurs simultaneously for all active threads within the warp. So, here in the cuda/hip kernel, variable var in shuffl_sync may not be initialised, and LLVM IR treats it as undef. Currently all the arguments are tagged with noundef attribute and the above mentioned optimization by SimplifyCFG gets applied and the kernel execution becomes ambiguous. So, the proposed change is to skip adding noundef attribute to arguments when a function has been tagged with convergent attribute.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D124158

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/PowerPC/ppc64le-varargs-f128.c
  clang/test/CodeGenCUDA/address-spaces.cu
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCUDA/cuda-builtin-vars.cu
  clang/test/CodeGenCUDA/kernel-args-alignment.cu
  clang/test/CodeGenCUDA/kernel-args.cu
  clang/test/CodeGenCUDA/redux-builtins.cu
  clang/test/CodeGenCUDA/usual-deallocators.cu
  clang/test/CodeGenCUDA/vtbl.cu
  clang/test/CodeGenCUDASPIRV/kernel-argument.cu
  clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
  clang/test/CodeGenOpenCL/address-spaces.cl
  clang/test/CodeGenOpenCL/amdgcn-automatic-variable.cl
  clang/test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl
  clang/test/CodeGenOpenCL/amdgpu-call-kernel.cl
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  clang/test/CodeGenOpenCL/as_type.cl
  clang/test/CodeGenOpenCL/atomic-ops-libcall.cl
  clang/test/CodeGenOpenCL/blocks.cl
  clang/test/CodeGenOpenCL/byval.cl
  clang/test/CodeGenOpenCL/const-str-array-decay.cl
  clang/test/CodeGenOpenCL/constant-addr-space-globals.cl
  clang/test/CodeGenOpenCL/convergent.cl
  clang/test/CodeGenOpenCL/fpmath.cl
  clang/test/CodeGenOpenCL/half.cl
  clang/test/CodeGenOpenCL/kernel-param-alignment.cl
  clang/test/CodeGenOpenCL/kernels-have-spir-cc-by-default.cl
  clang/test/CodeGenOpenCL/no-half.cl
  clang/test/CodeGenOpenCL/overload.cl
  clang/test/CodeGenOpenCL/size_t.cl
  clang/test/CodeGenOpenCL/spir-calling-conv.cl
  clang/test/CodeGenOpenCLCXX/address-space-deduction.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-of-this.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-references.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-with-class.clcpp
  clang/test/CodeGenSYCL/address-space-conversions.cpp
  clang/test/CodeGenSYCL/address-space-mangling.cpp
  clang/test/CodeGenSYCL/functionptr-addrspace.cpp
  clang/test/CodeGenSYCL/unique_stable_name.cpp
  clang/test/OpenMP/amdgcn-attributes.cpp
  clang/test/OpenMP/amdgcn_target_global_constructor.cpp
  clang/test/OpenMP/assumes_include_nvptx.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/declare_target_codegen_globalization.cpp
  clang/test/OpenMP/declare_target_link_codegen.cpp
  clang/test/OpenMP/declare_variant_mixed_codegen.c
  clang/test/OpenMP/distribute_codegen.cpp
  clang/test/OpenMP/distribute_simd_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
  clang/test/OpenMP/nvptx_target_printf_codegen.c
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/openmp_offload_codegen.cpp
  clang/test/OpenMP/reduction_implicit_map.cpp
  clang/test/OpenMP/target_firstprivate_codegen.cpp
  clang/test/OpenMP/target_parallel_codegen.cpp
  clang/test/OpenMP/target_parallel_debug_codegen.cpp
  clang/test/OpenMP/target_parallel_for_codegen.cpp
  clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_parallel_num_threads_codegen.cpp
  clang/test/OpenMP/target_private_codegen.cpp
  clang/test/OpenMP/target_reduction_codegen.cpp
  clang/test/OpenMP/target_teams_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp
  clang/test/OpenMP/target_teams_map_codegen.cpp
  clang/test/OpenMP/target_teams_num_teams_codegen.cpp
  clang/test/OpenMP/target_teams_thread_limit_codegen.cpp
  clang/test/OpenMP/teams_codegen.cpp
  llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll



More information about the cfe-commits mailing list