[PATCH] D30440: AMDGPU: Fix unnecessary ands when packing f16 vectors

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 19:10:01 PST 2017


arsenm created this revision.
Herald added subscribers: tpr, dstuttard, tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, aemerson.

computeKnownBits did not report the high bits of an fp_to_fp16
result as 0. The generic node cannot make that guarantee, because
ARM maps it to an instruction that does not modify the high bits
of the register, so introduce a target node whose high bits are
known to be 0.
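
To illustrate why the known-zero high bits matter, here is a minimal,
self-contained sketch. It does not use the LLVM API and is not taken from
the patch. Every name in it (KnownZero32, NodeKind, knownZeroFor,
isAndRedundant, packHalves) is hypothetical and exists only for this
example. It assumes a 32-bit result register in which the converted half
value occupies the low 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for known-bits tracking on a 32-bit
// value. A set bit in Zero means that bit of the value is known to be 0.
struct KnownZero32 {
  uint32_t Zero = 0;
};

// Hypothetical node kinds used only in this sketch.
enum class NodeKind {
  GenericFPToFP16, // nothing is known about the high 16 bits
  TargetFPToFP16   // the high 16 bits are known to be 0
};

// Report which bits of the conversion result are known to be zero.
KnownZero32 knownZeroFor(NodeKind Kind) {
  KnownZero32 Known;
  if (Kind == NodeKind::TargetFPToFP16)
    Known.Zero = 0xFFFF0000u;
  return Known;
}

// An `and X, Mask` is redundant when every bit the mask would clear is
// already known to be zero in X.
bool isAndRedundant(KnownZero32 Known, uint32_t Mask) {
  return (~Mask & ~Known.Zero) == 0;
}

// Pack two 16-bit halves into one 32-bit value. The mask on the low half
// is only needed when its high bits are not known to be zero.
uint32_t packHalves(uint32_t Lo, uint32_t Hi, bool LoHighBitsKnownZero) {
  uint32_t LoBits = LoHighBitsKnownZero ? Lo : (Lo & 0xFFFFu);
  return LoBits | (Hi << 16);
}
```

With the target node in this sketch, isAndRedundant returns true for the
mask 0xFFFF, so the masking step before packing can be skipped. With the
generic node it returns false and the mask has to stay.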


https://reviews.llvm.org/D30440

Files:
  lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  lib/Target/AMDGPU/AMDGPUISelLowering.h
  lib/Target/AMDGPU/AMDGPUInstrInfo.td
  lib/Target/AMDGPU/EvergreenInstructions.td
  lib/Target/AMDGPU/SIInstructions.td
  lib/Target/AMDGPU/VOP1Instructions.td
  test/CodeGen/AMDGPU/fadd.f16.ll
  test/CodeGen/AMDGPU/fmul.f16.ll
  test/CodeGen/AMDGPU/fptrunc.f16.ll
  test/CodeGen/AMDGPU/fsub.f16.ll
  test/CodeGen/AMDGPU/llvm.ceil.f16.ll
  test/CodeGen/AMDGPU/llvm.cos.f16.ll
  test/CodeGen/AMDGPU/llvm.exp2.f16.ll
  test/CodeGen/AMDGPU/llvm.exp2.ll
  test/CodeGen/AMDGPU/llvm.floor.f16.ll
  test/CodeGen/AMDGPU/llvm.fma.f16.ll
  test/CodeGen/AMDGPU/llvm.fmuladd.f16.ll
  test/CodeGen/AMDGPU/llvm.log2.f16.ll
  test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
  test/CodeGen/AMDGPU/llvm.minnum.f16.ll
  test/CodeGen/AMDGPU/llvm.rint.f16.ll
  test/CodeGen/AMDGPU/llvm.sin.f16.ll
  test/CodeGen/AMDGPU/llvm.sqrt.f16.ll
  test/CodeGen/AMDGPU/llvm.trunc.f16.ll
  test/CodeGen/AMDGPU/select.f16.ll
  test/CodeGen/AMDGPU/sitofp.f16.ll
  test/CodeGen/AMDGPU/uitofp.f16.ll
  test/CodeGen/AMDGPU/v_mac_f16.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D30440.89963.patch
Type: text/x-patch
Size: 52408 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170228/41aca2ca/attachment-0001.bin>

