[PATCH] R600/SI: Implement less wrong f32 fdiv
Matt Arsenault
Matthew.Arsenault at amd.com
Mon Jun 23 11:45:58 PDT 2014
Assuming single-precision denormals and accurate sqrt/div are not reported as supported, this passes the OpenCL conformance test.
The f64 version is still blocked by broken handling of the div_scale instructions. Operand legalization is too strict: it allows only one SGPR operand, whereas the actual restriction is that an instruction may read from only one SGPR, and that single SGPR may be used for multiple operands. When it does legalize div_scale, it inserts a copy for one of the SGPR operands, so the first operand is no longer the same as the second or third.
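To make the difference between the two rules concrete, here is a minimal sketch. It is not code from the patch or from LLVM; the Operand struct, RegKind enum, and both function names are hypothetical and exist only for illustration. The first check counts SGPR operands, which is the overly strict behavior described above; the second counts distinct SGPRs, which is the restriction as stated.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical operand model for illustration only; this is not
// LLVM's MachineOperand.
enum class RegKind { SGPR, VGPR };

struct Operand {
  RegKind Kind;
  unsigned RegId;
};

// Overly strict rule: reject the instruction as soon as more than one
// operand is an SGPR, even if all of them name the same register.
bool isLegalCountingOperands(const std::vector<Operand> &Ops) {
  std::size_t NumSGPROperands = 0;
  for (const Operand &Op : Ops)
    if (Op.Kind == RegKind::SGPR)
      ++NumSGPROperands;
  return NumSGPROperands <= 1;
}

// Restriction as described: at most one distinct SGPR may be read,
// no matter how many operands refer to it.
bool isLegalCountingDistinctSGPRs(const std::vector<Operand> &Ops) {
  std::set<unsigned> DistinctSGPRs;
  for (const Operand &Op : Ops)
    if (Op.Kind == RegKind::SGPR)
      DistinctSGPRs.insert(Op.RegId);
  return DistinctSGPRs.size() <= 1;
}
```

With operands {SGPR 4, SGPR 4, VGPR 1}, the first check rejects the instruction and the second accepts it. That is the div_scale situation: the strict check forces a copy, and the copy breaks the requirement that the first operand match the second or third.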
http://reviews.llvm.org/D4260
Files:
lib/Target/R600/SIISelLowering.cpp
lib/Target/R600/SIISelLowering.h
lib/Target/R600/SIInstructions.td
test/CodeGen/R600/fdiv.ll
test/CodeGen/R600/llvm.AMDGPU.rcp.ll
test/CodeGen/R600/rsq.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4260.10759.patch
Type: text/x-patch
Size: 11143 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140623/fa373570/attachment.bin>