[PATCH] D141054: [NVPTX] Set default version of architecture to SM_30, PTX to 6.0.

Fri Jan 6 14:24:02 PST 2023

asavonic added inline comments.

================
Comment at: llvm/test/CodeGen/NVPTX/surf-tex.py:2
 # RUN: %python %s --target=cuda --tests=suld,sust,tex,tld4 --gen-list=%t.list > %t-cuda.ll
-# RUN: llc %t-cuda.ll -verify-machineinstrs -o - | FileCheck %t-cuda.ll
-# RUN: %if ptxas %{ llc %t-cuda.ll -verify-machineinstrs -o - | %ptxas-verify %}
+# RUN: llc -mcpu=sm_20 %t-cuda.ll -verify-machineinstrs -o - | FileCheck %t-cuda.ll
+# RUN: %if ptxas %{ llc -mcpu=sm_20 %t-cuda.ll -verify-machineinstrs -o - | %ptxas-verify %}
----------------
tra wrote:
> pavelkopyl wrote:
> > tra wrote:
> > > We may as well change it to `sm_30`, too. sm_20 is gone for all practical purposes. Even sm_30 is on the way out.
> > Changing the version to sm_30 also changes the generated code. The reason is that sm_20 (Fermi) has no support for image handles; as a result, the NVPTX backend runs a special pass, NVPTXReplaceImageHandles, to work around this.
> > Textually, the code diff looks like this:
> > 
> > 
> > ```
> > .global .surfref gsurf;
> > ```
> > sm_20:
> > 
> > ```
> > suld.b.1d.b8.trap {%rs1}, [gsurf, {%r1}];
> > ```
> > 
> > sm_30:
> > ```
> > mov.u64 %rd3, gsurf;
> > suld.b.1d.b8.trap {%rs1}, [%rd3, {%r1}];
> > ```
> > 
> > We can change this test to support sm_30. But I think it's better to do this in another review. Is that OK?
> OK. We can keep the sm_20 test around for now.
> The reason is that sm_20 (Fermi) has no support for image handles; as a result, the NVPTX backend runs a special pass, NVPTXReplaceImageHandles, to work around this.

Can we run NVPTXReplaceImageHandles for other targets as well? These extra `mov` instructions are unnecessary, and the pass can eliminate them. 
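
To make the question concrete, here is a toy sketch of the rewrite I have in mind, written as plain Python over PTX text. It is only an illustration: the real NVPTXReplaceImageHandles pass works on machine instructions inside LLVM, and the helper name `fold_handle_movs` and its `handles` parameter are hypothetical. It also assumes the register holds nothing but the handle and is never redefined.

```python
import re

# Toy illustration only; this is NOT how the LLVM pass is implemented.
# Matches lines such as "mov.u64 %rd3, gsurf;".
MOV_RE = re.compile(r"^\s*mov\.u64\s+(%rd\d+),\s*([A-Za-z_]\w*);\s*$")


def fold_handle_movs(ptx_lines, handles):
    """Fold `mov.u64 %rdN, <handle>` into later uses of %rdN.

    `handles` is the set of global surface/texture symbol names.
    Assumes each such register is used only as an image handle.
    """
    reg_to_handle = {}
    out = []
    for line in ptx_lines:
        m = MOV_RE.match(line)
        if m and m.group(2) in handles:
            # Remember the mapping and drop the now-redundant mov.
            reg_to_handle[m.group(1)] = m.group(2)
            continue
        for reg, sym in reg_to_handle.items():
            # (?!\d) keeps %rd3 from matching inside %rd30.
            line = re.sub(re.escape(reg) + r"(?!\d)", sym, line)
        out.append(line)
    return out
```

Applied to the two sm_30 lines quoted above with `handles={"gsurf"}`, this yields the single `suld` line from the sm_20 output.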

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141054/new/

https://reviews.llvm.org/D141054