[all-commits] [llvm/llvm-project] 356254: [X86] Fix the cost model for v16i16->v16i32 zero_e...

Wed Jan 29 15:52:48 PST 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 35625464c6ddef557c2369946681be5cfb42d5c1
      https://github.com/llvm/llvm-project/commit/35625464c6ddef557c2369946681be5cfb42d5c1
  Author: Craig Topper <craig.topper at intel.com>
  Date:   2020-01-29 (Wed, 29 Jan 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/X86/arith-fix.ll
    M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
    M llvm/test/Analysis/CostModel/X86/cast.ll
    M llvm/test/Analysis/CostModel/X86/extend.ll
    M llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll

  Log Message:
  -----------
  [X86] Fix the cost model for v16i16->v16i32 zero_extend/sign_extend with AVX2

We seem to be inheriting the cost from SSE4.1. But if we have 256-bit registers, we should be able to do this with just one extract to split the v16i16 and two v8i16->v8i32 operations, so our cost should be 3, not 4.
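
The arithmetic behind the new cost can be sketched as below. This is only an illustrative model, not the code in X86TargetTransformInfo.cpp: the constant and function names are hypothetical, and it assumes each extract and each v8i16->v8i32 extend is charged a cost of 1.

```cpp
// Hypothetical cost model for illustration; names are not LLVM's.
// Assumption: every operation below is charged a unit cost.
constexpr unsigned kExtractUpperHalfCost = 1;    // split v16i16 into two v8i16 halves
constexpr unsigned kExtendV8i16ToV8i32Cost = 1;  // one 128-bit -> 256-bit extend

// v16i16 -> v16i32 with 256-bit registers: one extract for the upper half
// (the lower half is already addressable as the low 128 bits), then one
// extend per half.
constexpr unsigned extendV16i16ToV16i32Cost() {
  return kExtractUpperHalfCost + 2 * kExtendV8i16ToV8i32Cost;
}

// The value the commit message says was being inherited from SSE4.1.
constexpr unsigned kInheritedSSE41Cost = 4;
```

Under these assumptions the function evaluates to 3, one less than the inherited value.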

Differential Revision: https://reviews.llvm.org/D73646

  Commit: a10cec02f79082a1da571e44e68800025b7e6554
      https://github.com/llvm/llvm-project/commit/a10cec02f79082a1da571e44e68800025b7e6554
  Author: Craig Topper <craig.topper at intel.com>
  Date:   2020-01-29 (Wed, 29 Jan 2020)

  Changed paths:
    M clang/lib/CodeGen/CGBuiltin.cpp
    A clang/test/CodeGen/avx-builtins-constrained-cmp.c
    A clang/test/CodeGen/avx512f-builtins-constrained-cmp.c
    A clang/test/CodeGen/avx512vl-builtins-constrained-cmp.c
    A clang/test/CodeGen/sse-builtins-constrained-cmp.c
    A clang/test/CodeGen/sse2-builtins-constrained-cmp.c

  Log Message:
  -----------
  [X86] Improve X86 cmpps/cmppd/cmpss/cmpsd intrinsics with strictfp

The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To work around this, I'm emitting the old X86-specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target-specific intrinsics, so it doesn't seem like we need to solve that here.

I've also added support for selecting between signaling and quiet.
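
A rough sketch of the decision described above, assuming it is driven by the 5-bit comparison immediate as Intel documents it for the AVX compare instructions (low 4 bits select the base predicate, bit 4 flips signaling vs quiet). The struct and function names are hypothetical and this is not the code in CGBuiltin.cpp.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only.
struct CmpLowering {
  bool usesTargetIntrinsic;  // TRUE/FALSE: not expressible as a constrained fcmp
  bool signaling;            // true = signaling compare, false = quiet compare
};

constexpr CmpLowering classifyCmpImmediate(std::uint8_t imm) {
  const unsigned base = imm & 0x0f;
  // 0x0b is FALSE and 0x0f is TRUE in the documented encoding.
  const bool isTrueOrFalse = (base == 0x0b) || (base == 0x0f);
  // Among the first 16 encodings, the ordering comparisons
  // (LT, LE, NLT, NLE, NGE, NGT, GE, GT) are the signaling ones;
  // their low two bits are 01 or 10.
  bool signaling = (base & 0x3) == 1 || (base & 0x3) == 2;
  // Bit 4 selects the opposite signaling behavior for the same predicate.
  if (imm & 0x10)
    signaling = !signaling;
  return {isTrueOrFalse, signaling};
}
```

For example, immediate 0x01 (LT_OS) classifies as a signaling compare, 0x11 (LT_OQ) as a quiet one, and 0x0f (TRUE_UQ) as needing the target-specific intrinsic.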

Still need to support SAE, which will require using a target-specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.

Differential Revision: https://reviews.llvm.org/D72906

Compare: https://github.com/llvm/llvm-project/compare/228ea1a46cc8...a10cec02f790