<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/97299>97299</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [TypeLegalizer] Wrong code for certain non-power-of-2 sized int->vec BITCAST
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          uweigand
      </td>
    </tr>
</table>

<pre>
    On a big-endian machine, the following code:
```
define i1 @test(iN %in) {
  %vec = bitcast iN %in to <N x i1>
  %bit = extractelement <N x i1> %vec, i32 0
  ret i1 %bit
}
```

should, for any choice of `N`, extract the most significant bit of the input. This does indeed work correctly in most cases, specifically when `N` is either >= 16 or a power of two.
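
To make the expected result concrete, here is a small reference model in plain C++. It is not LLVM code: the function name and the convention of passing the `iN` value in the low `N` bits of a `uint32_t` are assumptions made for illustration only.

```cpp
#include <cstdint>

// Reference model (illustrative only, not LLVM code).
// `in` carries the iN value in its low n bits; 1 <= n <= 32 is assumed.
// On a big-endian target, element 0 of the <N x i1> result of the bitcast
// corresponds to the most significant bit of the iN input; on a
// little-endian target it corresponds to the least significant bit.
bool expectedElement0(uint32_t in, unsigned n, bool bigEndian) {
  unsigned bitIndex = bigEndian ? n - 1 : 0;
  return ((in >> bitIndex) & 1u) != 0;
}
```

For example, with `N` = 7 and input `0x40` (only the top bit of the `i7` set), the big-endian result should be true.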

However, for non-power-of-two sizes less than 16, at least on s390x, I see wrong code generation. What distinguishes the cases is that for power-of-two sizes, the `<N x i1>` vector is either already legal or else handled via `SplitVecRes_BITCAST` (which looks to be correct). For non-power-of-two sizes, the `WidenVecRes_BITCAST` routine is invoked instead, where we typically fall into this case:
```
    // If the InOp is promoted to the same size, convert it.  Otherwise,
    // fall out of the switch and widen the promoted input.
    SDValue NInOp = GetPromotedInteger(InOp);
    EVT NInVT = NInOp.getValueType();
    if (WidenVT.bitsEq(NInVT)) {
      // For big endian targets we need to shift the input integer or the
      // interesting bits will end up at the wrong place.
      if (DAG.getDataLayout().isBigEndian()) {
        unsigned ShiftAmt = NInVT.getSizeInBits() - InVT.getSizeInBits();
        EVT ShiftAmtTy = TLI.getShiftAmountTy(NInVT, DAG.getDataLayout());
        assert(ShiftAmt < WidenVT.getSizeInBits() && "Too large shift amount!");
        NInOp = DAG.getNode(ISD::SHL, dl, NInVT, NInOp,
                            DAG.getConstant(ShiftAmt, dl, ShiftAmtTy));
      }
      return DAG.getNode(ISD::BITCAST, dl, WidenVT, NInOp);
    }
    InOp = NInOp;
    InVT = NInVT;
    break;
```

If `N` is larger than 16, then the `WidenVT.bitsEq(NInVT)` check succeeds and we get a simple shift and bitcast, yielding the correct result.

However, for smaller values of `N`, the problem is that `WidenVT` is either `v8i1` or `v16i1`, but `NInVT` is always `i32` on s390x (since we don't support native arithmetic on smaller integer types, all integers are promoted to at least `i32`). In that case, the check fails and we fall through to the code below, with `InOp` now replaced by the `i32` version of the input.

The next block is guarded by `if (WidenSize % InScalarSize == 0`, which is also false since `WidenSize` is actually *smaller* than `InScalarSize`: the latter is now the size of the promoted `i32`. Therefore we fall through to the end and simply do:

```
  return CreateStackStoreLoad(InOp, WidenVT);
```

This doesn't appear to account at all for the possibility that `InOp` has a different (larger!) in-memory store size than `WidenVT`. It simply stores the widened input and loads the result from its first bytes, which on a big-endian system are typically just the padding bits.
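
The effect can be modelled outside of LLVM with a few lines of C++. This is a hypothetical simulation, not the code behind `CreateStackStoreLoad`. It assumes the `iN` input was zero-extended into 32 bits (promotion may in practice leave the high bits undefined), that the slot is written in big-endian byte order, and that the widened vector occupies `widenBits / 8` bytes. The function name and parameters are made up for the sketch.

```cpp
#include <cstdint>

// Hypothetical model of the fall-through path (not LLVM code).
// `promoted` is the i32 holding the iN input in its low bits;
// `widenBits` is the size of the widened vector, assumed to be 8 or 16.
bool element0ViaStackStoreLoad(uint32_t promoted, unsigned widenBits) {
  // Store the 32-bit value in big-endian order: most significant byte first.
  uint8_t slot[4];
  for (unsigned i = 0; i < 4; ++i)
    slot[i] = static_cast<uint8_t>(promoted >> (8 * (3 - i)));

  // Load only the first widenBits / 8 bytes, read as a big-endian integer.
  uint32_t loaded = 0;
  for (unsigned i = 0; i < widenBits / 8; ++i)
    loaded = (loaded << 8) | slot[i];

  // Element 0 of a big-endian <widenBits x i1> is the top bit that was loaded.
  return ((loaded >> (widenBits - 1)) & 1u) != 0;
}
```

Under these assumptions, an `i7` input of `0x40` (most significant bit set) with `widenBits` = 8 yields false, because the byte that is loaded is the top byte of the `i32`, which holds none of the original seven bits.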

I'm now wondering how to fix this. Ideally, this should result in just a shift (and mask) for those inputs as well. But if it has to fall through to the store/load method, I guess that should handle size differences somehow ...
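
As a rough illustration of the shift-and-mask idea, again in plain C++ rather than as a SelectionDAG patch: the helper names below are hypothetical, and the sketch assumes `n <= widenBits <= 32` with the `iN` value sitting in the low `n` bits of the promoted integer. It is only meant to show the arithmetic, not how a fix would be structured in `WidenVecRes_BITCAST`.

```cpp
#include <cstdint>

// Sketch only (not LLVM code): move the n significant bits to the top of a
// widenBits-wide integer, so that a big-endian bitcast to <widenBits x i1>
// places the MSB of the iN in element 0. The mask discards whatever the
// promoted value holds above bit n - 1.
uint32_t alignForBigEndianBitcast(uint32_t promoted, unsigned n,
                                  unsigned widenBits) {
  uint32_t mask = widenBits == 32 ? 0xFFFFFFFFu : ((1u << widenBits) - 1u);
  return (promoted << (widenBits - n)) & mask;
}

// Element 0 of a big-endian <widenBits x i1> is the top bit of the integer.
bool element0OfWidened(uint32_t widened, unsigned widenBits) {
  return ((widened >> (widenBits - 1)) & 1u) != 0;
}
```

With `n` = 7 and `widenBits` = 8, the input `0x40` becomes `0x80`, so element 0 is true as expected; garbage in the high bits of the promoted value is shifted out or masked off.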

Any suggestions welcome!

@topperc @arsenm @mikaelholmen @efriedma-quic 

</pre>