[PATCH] D30440: AMDGPU: Fix unnecessary ands when packing f16 vectors
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 27 19:10:01 PST 2017
arsenm created this revision.
Herald added subscribers: tpr, dstuttard, tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, aemerson.
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.
https://reviews.llvm.org/D30440
Files:
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
lib/Target/AMDGPU/AMDGPUISelLowering.h
lib/Target/AMDGPU/AMDGPUInstrInfo.td
lib/Target/AMDGPU/EvergreenInstructions.td
lib/Target/AMDGPU/SIInstructions.td
lib/Target/AMDGPU/VOP1Instructions.td
test/CodeGen/AMDGPU/fadd.f16.ll
test/CodeGen/AMDGPU/fmul.f16.ll
test/CodeGen/AMDGPU/fptrunc.f16.ll
test/CodeGen/AMDGPU/fsub.f16.ll
test/CodeGen/AMDGPU/llvm.ceil.f16.ll
test/CodeGen/AMDGPU/llvm.cos.f16.ll
test/CodeGen/AMDGPU/llvm.exp2.f16.ll
test/CodeGen/AMDGPU/llvm.exp2.ll
test/CodeGen/AMDGPU/llvm.floor.f16.ll
test/CodeGen/AMDGPU/llvm.fma.f16.ll
test/CodeGen/AMDGPU/llvm.fmuladd.f16.ll
test/CodeGen/AMDGPU/llvm.log2.f16.ll
test/CodeGen/AMDGPU/llvm.maxnum.f16.ll
test/CodeGen/AMDGPU/llvm.minnum.f16.ll
test/CodeGen/AMDGPU/llvm.rint.f16.ll
test/CodeGen/AMDGPU/llvm.sin.f16.ll
test/CodeGen/AMDGPU/llvm.sqrt.f16.ll
test/CodeGen/AMDGPU/llvm.trunc.f16.ll
test/CodeGen/AMDGPU/select.f16.ll
test/CodeGen/AMDGPU/sitofp.f16.ll
test/CodeGen/AMDGPU/uitofp.f16.ll
test/CodeGen/AMDGPU/v_mac_f16.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D30440.89963.patch
Type: text/x-patch
Size: 52408 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170228/41aca2ca/attachment-0001.bin>
More information about the llvm-commits
mailing list