[PATCH] R600/SI: Implement less wrong f32 fdiv

Matt Arsenault Matthew.Arsenault at amd.com
Mon Jun 23 11:45:58 PDT 2014


Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test.

The f64 version is still blocked by broken handling of the div_scale instructions. Operand legalization is too strict: it allows only one SGPR operand, whereas the actual restriction is that only one SGPR may be read, and that single SGPR can be used for multiple operands. When it legalizes div_scale, it inserts a copy for one of the SGPR operands, so the first operand is no longer the same as the second or third.
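To illustrate the difference between the two rules, here is a minimal standalone sketch. It is not the code in SIISelLowering.cpp or anywhere else in the backend; the Operand struct and both function names are hypothetical and exist only to contrast "at most one SGPR operand" with "at most one distinct SGPR read".

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified operand model (assumption: not LLVM's MachineOperand).
struct Operand {
  bool IsSGPR;   // true for a scalar register, false for a vector register
  unsigned Reg;  // register number
};

// The overly strict rule: reject anything with more than one SGPR operand,
// even if every SGPR operand names the same register.
bool isLegalByOperandCount(const std::vector<Operand> &Ops) {
  std::size_t NumSGPROperands = 0;
  for (const Operand &Op : Ops)
    if (Op.IsSGPR)
      ++NumSGPROperands;
  return NumSGPROperands <= 1;
}

// The rule as described above: at most one distinct SGPR may be read,
// but that register may appear in several operand slots.
bool isLegalByDistinctSGPR(const std::vector<Operand> &Ops) {
  std::set<unsigned> DistinctSGPRs;
  for (const Operand &Op : Ops)
    if (Op.IsSGPR)
      DistinctSGPRs.insert(Op.Reg);
  return DistinctSGPRs.size() <= 1;
}
```

With operands such as {s4, s4, v1}, the first check rejects the instruction and would force a copy, while the second accepts it and leaves the repeated operand intact, which is what div_scale needs.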

http://reviews.llvm.org/D4260

Files:
  lib/Target/R600/SIISelLowering.cpp
  lib/Target/R600/SIISelLowering.h
  lib/Target/R600/SIInstructions.td
  test/CodeGen/R600/fdiv.ll
  test/CodeGen/R600/llvm.AMDGPU.rcp.ll
  test/CodeGen/R600/rsq.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4260.10759.patch
Type: text/x-patch
Size: 11143 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140623/fa373570/attachment.bin>

