<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Relax ARM NEON literal rules"
   href="https://bugs.llvm.org/show_bug.cgi?id=44607">44607</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Relax ARM NEON literal rules
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>9.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>husseydevin@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Currently, Clang's NEON "constant" argument restrictions are too strict compared
to its own SSE2 intrinsics and to GCC.

#include <arm_neon.h>

static inline uint32x4_t shift(uint32x4_t inp, const int amt)
{
    return vshlq_n_u32(inp, amt);
}

int main()
{
    uint32x4_t val = vdupq_n_u32(2384);
    uint32x4_t shifted = shift(val, 3);
}

After inlining, `amt` in `shift` is the compile-time constant 3, so it should be
constant propagated, and Clang should accept this code.

GCC accepts this code, and Clang also accepts the SSE2 equivalent:

#include <emmintrin.h>

static inline __m128i shift(__m128i val, int amt)
{
    return _mm_slli_epi32(val, amt);
}

int main()
{
    __m128i val = _mm_set1_epi32(2384);
    __m128i shifted = shift(val, 3);
}

However, I get this with Clang 9.0.1 on Termux aarch64:

neon.cpp:7:12: error: argument to '__builtin_neon_vshlq_n_v' must be a
      constant integer
    return vshlq_n_u32(inp, amt);
           ^                ~~~
/data/data/com.termux/files/usr/lib/clang/9.0.1/include/arm_neon.h:24327:24:
note:
      expanded from macro 'vshlq_n_u32'
  __ret = (uint32x4_t) __builtin_neon_vshlq_n_v((int8x16_t)__s0, __p1, 50); \
                       ^                                         ~~~~
1 error generated.


In addition, GCC falls back to the non-immediate (register) form when the shift
amount is not a constant. If I remove `static inline`, GCC generates the
following assembly:

shift:
        dup     v1.4s, w0
        sshl    v0.4s, v0.4s, v1.4s
        ret

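Until the rule is relaxed, the same register form can be written by hand. The
following is only a sketch of a workaround, not the requested fix: the helper
names `shift_var` and `shift_lanes` are hypothetical, the shift amount is
assumed to be in the range 0..31, and the scalar branch exists only so the
snippet also builds on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: left shift by a run-time amount.
 * vshlq_u32 takes a per-lane shift vector instead of an immediate,
 * so `amt` does not have to be a constant expression. */
static inline uint32x4_t shift_var(uint32x4_t inp, int amt)
{
    return vshlq_u32(inp, vdupq_n_s32(amt));
}
#endif

/* Hypothetical helper: shifts four 32-bit lanes held in plain arrays.
 * Assumes 0 <= amt <= 31. */
static void shift_lanes(uint32_t out[4], const uint32_t in[4], int amt)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_u32(out, shift_var(vld1q_u32(in), amt));
#else
    /* Scalar fallback for non-NEON targets. */
    for (int i = 0; i < 4; i++)
        out[i] = in[i] << amt;
#endif
}
```

This gives up the immediate encoding even when the amount is a constant at the
call site, which is why relaxing the check in Clang would still be preferable.
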
This strict literal requirement makes it difficult to write things like C++
wrappers around the intrinsics, and Clang should relax it to match GCC's NEON
behavior and its own SSE2 behavior.</pre>
        </div>
      </p>


    </body>
</html>