<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57188>57188</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[NVPTX] st.global.u32 becomes st.u32 after updating from LLVM 10 to LLVM 15
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
python3kgae
</td>
</tr>
</table>
<pre>
Given LLVM IR like this:
```
target triple = "nvptx64-nvidia-cuda"
; Function Attrs: nounwind
define void @foo(i64* nocapture readonly byval(i64) %0, i64* nocapture readonly byval(i64) %itop) local_unnamed_addr #0 {
entry:
%1 = bitcast i64* %0 to float**
%2 = load float*, float** %1, align 8
store float 1.000000e+00, float* %2, align 4
%3 = load i64, i64* %itop, align 8
%4 = inttoptr i64 %3 to float*
store float 2.000000e+00, float* %4, align 4
ret void
}
attributes #0 = { nounwind }
!llvm.linker.options = !{!0, !1, !2, !3, !4}
!llvm.module.flags = !{!5, !6}
!llvm.ident = !{!7}
!nvvmir.version = !{!8}
!nvvm.annotations = !{!9, !10, !9, !11, !11, !11, !11, !12, !12, !11, !13, !14, !15}
!0 = !{!"/FAILIFMISMATCH:\22_MSC_VER=1900\22"}
!1 = !{!"/FAILIFMISMATCH:\22_ITERATOR_DEBUG_LEVEL=0\22"}
!2 = !{!"/FAILIFMISMATCH:\22RuntimeLibrary=MT_StaticRelease\22"}
!3 = !{!"/DEFAULTLIB:libcpmt.lib"}
!4 = !{!"/FAILIFMISMATCH:\22_CRT_STDIO_ISO_WIDE_SPECIFIERS=0\22"}
!5 = !{i32 1, !"wchar_size", i32 2}
!6 = !{i32 7, !"PIC Level", i32 2}
!7 = !{!"clang version 10.0.0 "}
!8 = !{i32 1, i32 4}
!9 = !{null, !"align", i32 8}
!10 = !{null, !"align", i32 8, !"align", i32 65544, !"align", i32 131080}
!11 = !{null, !"align", i32 16}
!12 = !{null, !"align", i32 16, !"align", i32 65552, !"align", i32 131088}
!13 = !{void (i64*, i64*)* @foo, !"kernel", i32 1}
!14 = !{void (i64*, i64*)* @foo, !"maxntidx", i32 1}
!15 = !{void (i64*, i64*)* @foo, !"minctasm", i32 2}
```
InstCombine in LLVM 10 transforms
```
%3 = load i64, i64* %itop, align 8
%4 = inttoptr i64 %3 to float*
store float 2.000000e+00, float* %4, align 4
```
into
```
%3 = bitcast i64* %itop to float**
%4 = load float*, float** %3, align 8
store float 2.000000e+00, float* %4, align 4
```
With LLVM 10, the final PTX output uses st.global.u32 for this store.
After updating to LLVM 15, InstCombine no longer performs this transform.
As a result, the final PTX output uses the generic st.u32 for the store.
Is this change expected?
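For comparison, here is a minimal sketch of how the global state space could be requested explicitly, instead of relying on the InstCombine rewrite. It assumes the address stored in %itop really does point to global memory. The function name @foo_explicit_global and the value names are hypothetical and not part of the original module. The snippet uses the same typed-pointer syntax as the IR above. NVPTX maps address space 1 to global memory, so a store through an addrspace(1) pointer should be selected as a st.global instruction. I have not checked the exact PTX this produces with LLVM 15.
```llvm
target triple = "nvptx64-nvidia-cuda"

; Hypothetical variant of @foo that keeps only the second store.
define void @foo_explicit_global(i64* nocapture readonly byval(i64) %itop) {
entry:
  ; Load the raw 64-bit address from the byval kernel parameter.
  %addr = load i64, i64* %itop, align 8
  ; Assumption: the address refers to global memory, so cast it to
  ; addrspace(1) (global on NVPTX) rather than to a generic pointer.
  %p = inttoptr i64 %addr to float addrspace(1)*
  store float 2.000000e+00, float addrspace(1)* %p, align 4
  ret void
}

!nvvm.annotations = !{!0}
!0 = !{void (i64*)* @foo_explicit_global, !"kernel", i32 1}
```
This only sidesteps the question for code that can be changed at the source. It does not explain why the generic store is now emitted for the original IR.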
Thanks
Xiang
</pre>