[all-commits] [llvm/llvm-project] d60b3b: [X86] Add isel patterns for bitcasting between v32...

topperc via All-commits all-commits at lists.llvm.org
Wed Jan 8 10:06:46 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: d60b3b4817cb9346b682bb75371c41642c273b13
      https://github.com/llvm/llvm-project/commit/d60b3b4817cb9346b682bb75371c41642c273b13
  Author: Craig Topper <craig.topper at intel.com>
  Date:   2020-01-08 (Wed, 08 Jan 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86InstrAVX512.td
    M llvm/test/CodeGen/X86/avx512bw-mask-op.ll

  Log Message:
  -----------
  [X86] Add isel patterns for bitcasting between v32i1/v64i1 and float/double.

The cast has to go through a GPR as an intermediate step.

Fixes PR43750.


  Commit: 3811417f39a7d0a370fac2923060f5ef8dacd8d7
      https://github.com/llvm/llvm-project/commit/3811417f39a7d0a370fac2923060f5ef8dacd8d7
  Author: Craig Topper <craig.topper at intel.com>
  Date:   2020-01-08 (Wed, 08 Jan 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/vec_int_to_fp.ll

  Log Message:
  -----------
  [X86] Custom type legalize v4i64->v4f32 uint_to_fp on sse4.1 targets in 64-bit mode

For v4i64->v4f32 uint_to_fp on pre-avx targets where v4i64 isn't legal, we create two v2i64->v2f32 uint_to_fp nodes whose results need to be shuffled together. Our codegen for v2i64->v2f32 detects whether the number is larger than (2^63 - 1), i.e. too large for a signed i64. If so, we do a special division by 2 so we can use a signed conversion, which we need to scalarize, and then multiply by 2 at the end if we divided earlier.

When v4i64 isn't legal, we need to split the check for a large number and the division by 2 into two v2i64 vectors. The scalar part can extract the 4 i64 values from those 2 halves, but we can reassemble the 4 scalar f32 results directly into a single v4f32 vector. Then we just need to combine the fixup masks from the 2 halves, and we can do the final multiply-by-2 fixup on all 4 values at once, where needed, using a single v4f32 blend and a v4f32 fadd.
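
The sequence described above can be modeled in plain C++. This is only a scalar sketch, not the code in X86ISelLowering.cpp: the function name uint4_to_float4 is hypothetical, the "too large" test is assumed to be a check of the top bit of each 64-bit input, and the "special division by 2" is assumed to OR the shifted-out low bit back in as a sticky bit so the final rounding is unchanged.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the v4i64 -> v4f32 uint_to_fp sequence.
std::array<float, 4> uint4_to_float4(const std::array<std::uint64_t, 4>& in) {
    std::array<float, 4> converted{};
    std::array<bool, 4> fixup{};

    for (std::size_t i = 0; i < 4; ++i) {
        const std::uint64_t x = in[i];
        // A lane needs the fixup when it does not fit in a signed i64.
        fixup[i] = (x >> 63) != 0;
        // Assumed "special division by 2": halve, keeping the lost bit sticky.
        const std::uint64_t halved = (x >> 1) | (x & 1);
        const std::uint64_t src = fixup[i] ? halved : x;
        // Scalarized signed conversion; src always has its top bit clear here.
        converted[i] = static_cast<float>(static_cast<std::int64_t>(src));
    }

    // One pass over all four lanes: an add doubles the value (the fadd),
    // and the mask selects between doubled and original (the blend).
    std::array<float, 4> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        const float doubled = converted[i] + converted[i];
        out[i] = fixup[i] ? doubled : converted[i];
    }
    return out;
}
```

Each lane of the result should match a direct static_cast<float> of the same unsigned input. The sticky bit matters for inputs such as 0x8000008000000001, where a plain shift would turn a round-up case into an exact tie.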

Differential Revision: https://reviews.llvm.org/D72368


Compare: https://github.com/llvm/llvm-project/compare/29ccb12e2c12...3811417f39a7

