[PATCH] D54719: [Intrinsic] Signed Fixed Point Multiplication Intrinsic

Wed Nov 21 01:37:41 PST 2018

ebevhan added inline comments.

================
Comment at: llvm/docs/LangRef.rst:12629
+less than the source value, or rounded ip to the closest fixed point value
+greater than the source value. This rounding is dependent on the target.
+
----------------
leonardchan wrote:
> ebevhan wrote:
> > This doc doesn't say what happens on overflow. Truncation/wraparound? Or should we consider it undefined?
> > 
> > If it says that the rounding is dependent on the target, will the rounding mode be target-configurable and exposed through TLI/TTI? We essentially can't touch these intrinsics (constant folding, optimization, even legalization) otherwise.
> > 
> > Looking at the legalization sequence, it will definitely lower these into a round-down form. If a target has some legal operations which round up and some non-legal operations, then the legal ones will round up and the non-legal ones will round down. That's sort of messy.
> > 
> > It might be safer to say that the rounding is indeterminate, but that's even worse for optimization.
> Overflow I was going to leave as undefined behavior. Added this into the docs.
> 
> By target dependent, I meant that whatever legal operation a target could replace this intrinsic with has the choice of rounding up or down. I'm not sure how bad this would be for optimization, but I imagine rounding isn't something that needs to be touched immediately since with this intrinsic, I'm just attempting to match what the standard says. I think one of the conclusions we also came up to in the long thread on llvm-dev was that, for now, we don't need to do more than what's necessary to implement the spec.
> 
> I changed the docs though to say rounding is indeterminable, as the spec says.
Okay. If we claim that it's indeterminate I guess we can still fold however we like, but the results would seem a bit inconsistent.

================
Comment at: llvm/docs/LangRef.rst:12635
+When the source value does not fit within the range of the fixed point type,
+the conversion overflows and is undefined behavior.
+
----------------
I think "operation" might be clearer than "conversion". Or don't mention it at all and just say "It is undefined behavior if the source value does not fit within the range of the fixed point type."

The rounding section is also a bit wordy. "If the result value cannot be precisely represented in the given scale, the value is rounded up or down to the closest representable value. The rounding direction is unspecified."

Also, my choice of the word "indeterminate" was a bit unfortunate, I think the rest of the LangRef uses "unspecified".

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:2543
+          DAG.getNode(ISD::MUL, dl, VT, LHS, RHS).getNode(), Lo, Hi, NVT, DAG,
+          TargetLowering::MulExpansionKind::OnlyLegalOrCustom, LL, LH, RL, RH))
+    return;
----------------
leonardchan wrote:
> ebevhan wrote:
> > Maybe I'm missing something here, but isn't this just a normal, expanded multiplication?
> This should actually be an expansion of SMUL_LOHI since we need the high bits when scaling down the result after multiplication. Updated the code, but one thing I found was that `expandMUL_LOHI` depends on `ADDC` but does not have a check for it. When rerunning the tests, it seems that X86 does not support 32 bit `ADDE` (and `ADDC`).
> 
> ```
> LLVM ERROR: Cannot select: t81: i32,glue = adde t121, t44:1, t80:1
> ```
> 
> so I removed these from the lit tests, but if we want to support expansion of this, I'm not sure if the solution is to add a check into `expandMUL_LOHI` to check `isOperationLegalOrCustom(ISD::ADDC/E, VT)` and if it is not legal, write out the expanded version. I found a comment in `ExpandIntRes_ADDSUB()` saying that there currently does not appear to be a way of generating a value of `MTV::Glue` if we were to expand the result.
It's a bit odd that x86 doesn't support ADDC/ADDE. Those nodes are deprecated in favor of ADDCARRY, so expandMUL_LOHI should probably be making expansions with ADDCARRY instead.

@craig.topper probably knows more about this.

================
Comment at: llvm/test/CodeGen/X86/smul_fix.ll:28
+; X86-NEXT:    retl
+  %tmp = call i32 @llvm.smul.fix.i32(i32 %x, i32 %y, i32 2);
+  ret i32 %tmp;
----------------
Interesting that the 32-bit target produces better code in most cases.

Repository:
  rL LLVM

https://reviews.llvm.org/D54719