<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/150973>150973</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Machine uniformity Analysis] uniformity of a value inside and outside a loop
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jgu222
</td>
</tr>
</table>
<pre>
In some cases a value is uniform inside a loop but divergent outside it.
For example,
```
;
; opt -mtriple amdgcn-unknown-amdhsa -passes='print<uniformity>' -disable-output phi_loop_issue.ll
;
define amdgpu_kernel void @test_ctrl_divergence() {
B0:
%tid = call i32 @llvm.amdgcn.workitem.id.x()
br label %B1
B1:
%i = phi i32 [0, %B0], [%i_n, %B1]
%div.cond = icmp slt i32 %tid, %i
%i_n = add i32 %i, 1
br i1 %div.cond, label %B1, label %B2
B2:
%a = add i32 %i, 10
ret void
}
declare i32 @llvm.amdgcn.workitem.id.x() #0
attributes #0 = {nounwind readnone }
```
The result from uniformity analysis:
```
UniformityInfo for function 'test_ctrl_divergence':
CYCLES WITH DIVERGENT EXIT:
depth=1: entries(B1)
TEMPORAL DIVERGENCE LIST:
Value : %i = phi i32 [ 0, %B0 ], [ %i_n, %B1 ]
Used by : %a = add i32 %i, 10
Outside cycle :depth=1: entries(B1)
BLOCK B0
DEFINITIONS
DIVERGENT: %tid = call i32 @llvm.amdgcn.workitem.id.x()
TERMINATORS
br label %B1
END BLOCK
BLOCK B1
DEFINITIONS
%i = phi i32 [ 0, %B0 ], [ %i_n, %B1 ]
DIVERGENT: %div.cond = icmp slt i32 %tid, %i
%i_n = add i32 %i, 1
TERMINATORS
DIVERGENT: br i1 %div.cond, label %B1, label %B2
END BLOCK
BLOCK B2
DEFINITIONS
DIVERGENT: %a = add i32 %i, 10
TERMINATORS
ret void
END BLOCK
```
Because the loop (B1) has a divergent exit condition, %i is divergent outside the loop: each lane may exit the loop with a different value of %i. Within the loop, however, %i is uniform. Since the same value %i is used both inside and outside the loop, a single uniformity state for %i cannot represent both facts.
Traditionally, this is resolved by using LCSSA, which separates the value inside the loop from its uses outside the loop. However, there is currently no LCSSA pass for MachineFunction.
The purpose of this issue is to invite input on a better solution. Would it be better to port the LCSSA function pass to a MachineFunction pass?
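For reference, here is a hand-written sketch of the same test in LCSSA form at the IR level. The name %i.lcssa is chosen here for illustration, and the `lcssa` entry in the pass pipeline is an assumption about how one would reproduce this form. No analysis output is shown because this variant was not run. The intent is that %i can stay uniform inside the loop, while the single-input phi in the exit block gives the analysis a separate value to mark divergent.
```
;
; Assumed invocation (not verified here):
; opt -mtriple amdgcn-unknown-amdhsa -passes='lcssa,print<uniformity>' -disable-output phi_loop_issue.ll
;
define amdgpu_kernel void @test_ctrl_divergence() {
B0:
%tid = call i32 @llvm.amdgcn.workitem.id.x()
br label %B1
B1:
%i = phi i32 [0, %B0], [%i_n, %B1]
%div.cond = icmp slt i32 %tid, %i
%i_n = add i32 %i, 1
br i1 %div.cond, label %B1, label %B2
B2:
; Single-input phi in the exit block: the only use of %i outside the loop.
%i.lcssa = phi i32 [%i, %B1]
%a = add i32 %i.lcssa, 10
ret void
}
declare i32 @llvm.amdgcn.workitem.id.x() #0
attributes #0 = {nounwind readnone }
```
A MachineFunction equivalent would need a comparable copy or PHI at the loop exit, which is what a Machine LCSSA pass would have to provide.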
</pre>