[PATCH] D158603: [AMDGPU][TargetMachine] Handle case when +extended-image-insts is set, and the user forces +wave64

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 23 06:33:22 PDT 2023


arsenm requested changes to this revision.
arsenm added a comment.
This revision now requires changes to proceed.

I think changing the behavior of the global subtarget is a bad idea. We shouldn't be relying on the global subtarget in the first place, much less selectively changing the interpretation from "set default" to "append".

Using one set of IR functions and pretending they work for both wave sizes is a bad idea and simply not working. You're running into the consequences of several layers of hacks in device libs, comgr and clang to pretend this works. The wave size behaves more like a different target-cpu, rather than a feature you can simply union. I increasingly think we should move the wavefront size out from target-features and into a separate attribute (or introduce a set of wave32 calling conventions).

There may be several bugs here, and I don't think any of them should be addressed in the backend. First, last I knew comgr was not using clang/-mlink-builtin-bitcode (although maybe this was fixed?). It doesn't matter because I just performed an experiment and found that's also broken:

  // builtins.cl 
  // clang -O3 -target amdgcn-amd-amdhsa -c -emit-llvm -o builtin.bc builtin.cl
  
  __attribute__((target("extended-image-insts")))
  void uses_images(void) {
  
  }
  
  void uses_nothing(void) {
  
  }
  
  // clang -target amdgcn-amd-amdhsa -mcpu=gfx1031 -mwavefrontsize64 -S -emit-llvm -Xclang -mlink-builtin-bitcode -Xclang builtin.bc  -o - builtin-user.cl  -O0
  // builtin-user.cl
  extern void uses_images(void);
  extern void uses_nothing(void);
  
  void user_images(void) {
      uses_images();
  }
  
  void user_nothing(void) {
      uses_nothing();
  }

This is broken, because the imported builtin didn't append the feature to uses_images:

  define internal void @uses_images() #1 {
    ret void
  }
  
  ; Function Attrs: convergent mustprogress nofree norecurse nosync nounwind willreturn memory(none)
  define internal void @uses_nothing() #2 {
    ret void
  }
  
  attributes #1 = { convergent mustprogress nofree norecurse nosync nounwind willreturn memory(none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="gfx1031" "target-features"="+extended-image-insts" }
  attributes #2 = { convergent mustprogress nofree norecurse nosync nounwind willreturn memory(none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="gfx1031" }

@uses_nothing isn't even getting the +wavefrontsize64, which is doubly broken. I believe this was working correctly at one point, so did this regress? clang is broken here, so that should be fixed first. If comgr is still not correctly using -mlink-builtin-bitcode, it should implement an approximately equivalent hack to append the features if there are still blockers to doing so.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:585-587
+    bool EnableExtendedImageInstsForFunction =
+        FunctionFS == "+extended-image-insts" &&
+        !TargetFS.contains("-extended-image-insts");
----------------
There's nothing special about extended-image-insts to warrant special casing it here. Don't really understand how it connects to the original failure


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h:34
   StringRef getGPUName(const Function &F) const;
-  StringRef getFeatureString(const Function &F) const;
+  std::string getFeatureString(const Function &F) const;
 
----------------
Feature string shouldn't require adjustment


================
Comment at: llvm/test/CodeGen/AMDGPU/extended-image-insts-wave32-wave64.ll:2
+; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx1010 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,WAVE32 %s
+; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,WAVE32 %s
+; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize64" -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,WAVE64 %s
----------------
Don't need quotes or -verify-machineinstrs


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158603/new/

https://reviews.llvm.org/D158603



More information about the llvm-commits mailing list