<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/58176>58176</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Propagating range metadata for bool causes regressions in PyTorch unit tests on ROCm
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yxsamliu
      </td>
    </tr>
</table>

<pre>
    8018d6be3459780e81a5da128a9915eb27909902 triggered some regressions in PyTorch unit tests on ROCm.

To build PyTorch on ROCm, please follow https://github.com/ROCmSoftwarePlatform/pytorch/wiki/Building-PyTorch-for-ROCm

To reproduce the issue:

cd /var/lib/jenkins/pytorch/test

PYTORCH_TEST_WITH_ROCM=1 /opt/conda/bin/python3.7 test_cuda.py TestCuda.test_cusparse_multiple_threads_same_device -v

PYTORCH_TEST_WITH_ROCM=1 /opt/conda/bin/python3.7 test_ops.py TestCommonCUDA.test_non_standard_bool_values_to_sparse_cuda_bool -v

Observed Behaviour :
======================

test_cuda failed! Received signal: SIGIOT
test_native_mha failed! Received signal: SIGIOT
test_nestedtensor failed! Received signal: SIGIOT
test_ops failed!
test_shape_ops failed! Received signal: SIGKILL
test_testing failed!
test_type_hints failed!
================================================================================
test_cudart_register (__main__.TestCuda) ... ok
test_cudnn_allow_tf32_get_set (__main__.TestCuda) ... ok
test_cudnn_multiple_threads_same_device (__main__.TestCuda) ... skipped "test doesn't currently work on the ROCm stack"
test_current_stream (__main__.TestCuda) ... skipped 'detected only one GPU'
test_cusparse_multiple_threads_same_device (__main__.TestCuda) ... Memory exception on virtual address 0x7f37d5df2000, node id 1 : Page not present
Address does not belong to a known buffer
Memory access fault by GPU node-1 (Agent handle: 0x55d921151270) on address 0x7f37d5df2000. Reason: Page not present or supervisor privilege.
test_cuda failed! Received signal: SIGIOT
====================================================================== 
FAIL: test_non_standard_bool_values_to_sparse_cuda_bool (__main__.TestCommonCUDA)
----------------------------------------------------------------------

8018d6be3459780e81a5da128a9915eb27909902 itself seems correct. It appears to have exposed some other middle-end or backend issue.

It seems the ISA change causing the regression comes from the DAG combiner folding zext when its operand is a bool:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L12318 

Disabling the zext combine makes the tests pass. However, this does not necessarily mean the DAG combiner is incorrect. It is also possible that some other optimization causes bool values to be stored incorrectly through bool pointers, so that the bool values loaded later are already wrong.
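
To illustrate the second possibility, here is a toy model in Python. It is only a sketch under assumptions: plain integers stand in for i8/i32 values, the helper names are hypothetical and are not LLVM APIs, and the fold shown (replacing "byte != 0" with the byte itself) is just one example of a simplification that a [0, 2) range on a bool load would permit. It is not claimed to be the exact combine at the DAGCombiner line linked above.

```python
# Toy model only: Python ints stand in for i8 / i32 values.
# All names below are hypothetical and exist purely for illustration.

def load_byte(memory, addr):
    """Model an i8 load from a bool-typed location."""
    return memory[addr] & 0xFF

def to_int_general(byte):
    """Lowering with no range assumption: zext(byte != 0)."""
    return 1 if byte != 0 else 0

def to_int_assuming_range(byte):
    """Lowering that is only valid if byte is in [0, 2): zext(byte) as-is."""
    return byte

# Addresses 0 and 1 hold standard bools; 2 and 3 hold non-standard ones.
memory = {0: 0, 1: 1, 2: 2, 3: 255}

for addr in sorted(memory):
    b = load_byte(memory, addr)
    general = to_int_general(b)
    ranged = to_int_assuming_range(b)
    status = "agree" if general == ranged else "DIFFER"
    print(f"addr={addr} byte={b} general={general} ranged={ranged} {status}")
```

In this model the two lowerings agree for the bytes 0 and 1 and differ for any other byte. If something upstream stores a non-0/1 byte through a bool pointer, code that relies on the range would then compute with a wrong value (for example, an index or count) even though the range-based fold is itself legal.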

</pre>