[llvm-bugs] [Bug 30506] <intrin.h> does not declare _tzcnt_u32

via llvm-bugs llvm-bugs at lists.llvm.org
Thu Oct 10 23:24:49 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=30506

Craig Topper <craig.topper at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Craig Topper <craig.topper at gmail.com> ---
I've removed the __BMI__ check on these intrinsics in r374516. But I don't
think it really accomplishes what ffmpeg wanted.

On real MSVC, using _tzcnt_u32 blindly emits the F3 0F BC tzcnt encoding, which
is equivalent to rep+bsf. MSVC doesn't have a concept of enabling features on
the command line. Using an intrinsic always generates the instruction. On
legacy CPUs, the F3 prefix is ignored and the behavior for an input of 0
won't match tzcnt. ffmpeg knows this, but doesn't send 0, so this is fine.
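
To make the input-of-0 difference concrete, here is a rough software model of
the two instructions in plain C. It is only a sketch of the documented
semantics, not what any compiler emits; the names model_tzcnt32 and
model_bsf32 are made up for illustration. tzcnt returns the operand size (32)
for an input of 0, while bsf leaves its destination undefined, which the model
represents by reporting failure and not writing the result.

```c
#include <stdint.h>

/* Hypothetical model of tzcnt: number of trailing zero bits,
 * with the defined result 32 when the input is 0. */
static uint32_t model_tzcnt32(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0)
        return 32;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical model of bsf: same bit index as tzcnt for nonzero input.
 * For 0 the hardware result is undefined, so the model returns 0
 * (failure) and leaves *out untouched; callers must not rely on it. */
static int model_bsf32(uint32_t x, uint32_t *out)
{
    if (x == 0)
        return 0;
    *out = model_tzcnt32(x);
    return 1;
}
```

For any nonzero input the two models agree, which is why code that never
passes 0 can treat the two encodings as interchangeable.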

clang, on the other hand, has CPU features. We turn the _tzcnt_u32 intrinsic into
llvm.cttz in IR with a defined behavior for 0. If tzcnt isn't enabled on the
command line, the backend has to emulate the 0 behavior and we won't emit the
tzcnt instruction. So we end up with bsf+cmov and a few other instructions.
That is worse than the code we would have gotten if ffmpeg had just used
__builtin_ctz, which is undefined for 0 and would just generate a bsf.
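
As a source-level illustration of that trade-off, the sketch below contrasts a
count-trailing-zeros helper that is defined for 0 with one that pushes the
nonzero requirement onto the caller. The helper names are hypothetical, and
__builtin_ctz is the gcc/clang builtin, so this does not build with MSVC. The
instructions actually produced depend on the compiler and target flags; the
sketch only shows where the extra zero handling comes from.

```c
#include <stdint.h>

/* Defined for every input, mirroring the tzcnt contract: returns 32 for 0.
 * The explicit zero test is the extra work a backend has to emulate
 * when it cannot use the tzcnt instruction itself. */
static inline uint32_t ctz32_defined(uint32_t x)
{
    return x ? (uint32_t)__builtin_ctz(x) : 32u;
}

/* Caller guarantees x != 0; __builtin_ctz(0) is undefined, so no
 * zero handling is needed and a plain bit scan is sufficient. */
static inline uint32_t ctz32_nonzero(uint32_t x)
{
    return (uint32_t)__builtin_ctz(x);
}
```

A caller that already knows its input is nonzero, as ffmpeg reportedly does,
loses nothing by using the second form.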

If I recall correctly, gcc will emit __builtin_ctz as rep+bsf with certain
-march options, possibly including the default. It is specifically written that
way to support older versions of binutils that don't support the tzcnt mnemonic,
since the user never mentioned tzcnt anywhere. On CPUs that support tzcnt this
will decode as tzcnt; on older CPUs it will be treated as bsf. tzcnt has better
throughput on some AMD CPUs than bsf, which is probably the very reason ffmpeg
used the tzcnt intrinsic instead of the bsf intrinsic.
