[PATCH] Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221)

Sanjay Patel spatel at rotateright.com
Sun Jan 11 15:57:12 PST 2015


In http://reviews.llvm.org/D6885#107200, @mkuper wrote:

> Do you think we can fix the correctness issue without introducing extra copies?


Hi Michael -
Thanks for looking at this patch. I agree that we'll produce suboptimal code for your test case, and I think this will match the existing suboptimal codegen for the equivalent test cases using sqrtss, rsqrtss, and rcpss. I'm not sure how to solve this; it seems like it will require writing custom lowering code rather than tablegen patterns? A few points to consider before we tackle that:

1. Correctness is more important than the minor perf problem.

2. The perf problem is limited only to legacy SSE codegen; with AVX, we don't have to worry about the implicit source op.

3. There's some weirdness in particular with the sqrtsd case: Intel/AMD defined the C intrinsic with 2 input operands. Why is the LLVM intrinsic defined with only 1 input? If we fix that definition, I think we can solve the problem for sqrtsd (but not sqrtss, rsqrtss, rcpss).

My preference is to commit this patch to preserve correctness (finally close PR14221) and then file the perf bug separately...unless you see a quick fix that we can implement to solve all of the problems in one patch?


http://reviews.llvm.org/D6885

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list