[PATCH] D73978: [WIP][FPEnv] Don't transform FSUB(-0.0,X)->FNEG(X) when flushing denormals

Mon Feb 24 15:15:29 PST 2020

cameron.mcinally added a comment.

Thanks for that patch, Sanjay.

I have another issue which I hope you can help me sort out. There's a transform in `narrowExtractedVectorBinOp(...)` in DAGCombiner.cpp:

`// extract (binop B0, B1), N --> binop (extract B0, N), (extract B1, N)`

This transform only fires for binops, so it no longer applies once SelectionDAGBuilder has converted the FSUB to an FNEG, which is a unary op.
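To make the transform concrete, here is a minimal sketch that models vectors as Python lists. It is not LLVM code: `extract_subvector`, `binop`, and the two wrapper functions are hypothetical helpers named after the DAG nodes, and the input values are made up for illustration.

```python
import operator

# Hypothetical list-based model of the DAG nodes; not LLVM APIs.
def extract_subvector(vec, start, width):
    """Return `width` consecutive lanes of `vec`, beginning at lane `start`."""
    return vec[start:start + width]

def binop(op, lhs, rhs):
    """Apply a two-operand op lane by lane."""
    return [op(a, b) for a, b in zip(lhs, rhs)]

def extract_of_binop(op, b0, b1, start, width):
    # extract (binop B0, B1), N
    return extract_subvector(binop(op, b0, b1), start, width)

def binop_of_extracts(op, b0, b1, start, width):
    # binop (extract B0, N), (extract B1, N)
    return binop(op,
                 extract_subvector(b0, start, width),
                 extract_subvector(b1, start, width))

neg_zero = [-0.0] * 4            # the <float -0.0, ...> splat operand
rhs = [1.0, 2.0, 3.0, 4.0]       # illustrative input, not from the test case

wide = extract_of_binop(operator.sub, neg_zero, rhs, 2, 2)
narrow = binop_of_extracts(operator.sub, neg_zero, rhs, 2, 2)
```

Both forms yield the same lanes, which is why the combine is legal for a two-operand FSUB. Per the description above, the pattern is only matched for binops, so an FNEG node keeps its extract on the result.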

The IR is...

  %rhs_neg = fsub <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>, %rhs
  %splat = shufflevector <4 x float> %rhs_neg, <4 x float> undef, <2 x i32> <i32 3, i32 3>

and after DAGCombine we end up with DAGs like this...

  FNEG:
  <               t9: v4f32 = bitcast t8
  <             t24: v4f32 = fneg t9
  <           t15: v2f32 = extract_subvector t24, Constant:i64<2>
  <         t17: v2f32 = vector_shuffle<1,1> t15, undef:v2f32

  FSUB:
  >               t29: v1i64 = extract_subvector t8, Constant:i64<1>
  >             t30: v2f32 = bitcast t29
  >           t32: v2f32 = fneg t30
  >         t17: v2f32 = vector_shuffle<1,1> t32, undef:v2f32

Moving the extract to the operands (the FSUB case) is a problem on AArch64, since the extract **could** otherwise be rolled into the shuffle (the FNEG case). E.g.:

  FNEG:
  <             t9: v4f32 = bitcast t8
  <           t24: v4f32 = fneg t9
  <         t26: v2f32 = AArch64ISD::DUPLANE32 t24, Constant:i64<3>

  FSUB:
  >                 t29: v1i64 = extract_subvector t8, Constant:i64<1>
  >               t30: v2f32 = bitcast t29
  >             t32: v2f32 = fneg t30
  >           t36: v4f32 = insert_subvector undef:v4f32, t32, Constant:i32<0>
  >         t37: v2f32 = AArch64ISD::DUPLANE32 t36, Constant:i64<1>
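The two final DAGs above should compute the same value; the FSUB form just takes more nodes to get there. A rough sketch of that equivalence, again modeling vectors as Python lists: the helper names are hypothetical stand-ins for the DAG nodes, `None` stands in for undef lanes, the bitcasts are omitted because the model works directly on float lanes, and the input values are invented.

```python
UNDEF = None  # stand-in for undef lanes; an assumption of this model

def fneg(vec):
    """Negate every lane."""
    return [-x for x in vec]

def duplane(vec, lane, width=2):
    """Model of a lane duplicate: broadcast one lane into `width` lanes."""
    return [vec[lane]] * width

def insert_subvector(dst, sub, start):
    """Return a copy of `dst` with `sub` written at lane `start`."""
    out = list(dst)
    out[start:start + len(sub)] = sub
    return out

rhs = [1.0, 2.0, 3.0, 4.0]  # illustrative v4f32 input

# FNEG form: negate the full vector, then duplicate lane 3 directly.
fneg_path = duplane(fneg(rhs), 3)

# FSUB form: extract the high half first (lanes 2..3), negate it,
# re-widen it into an undef vector, then duplicate lane 1.
high_half = rhs[2:4]
widened = insert_subvector([UNDEF] * 4, fneg(high_half), 0)
fsub_path = duplane(widened, 1)
```

Lane 3 of the wide vector is lane 1 of its high half, so the results match. In this model, the extract and insert in the FSUB form only shift which lane is duplicated.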

Any insight on the best way to correct this difference? I suppose I could fix up the extract+insert at the MachineInstruction level, but that doesn't seem like the correct fix since other targets could have the same problem.

I'm also a little skeptical that moving the extracts to the operands is a win in the general case. It seems like the transform would be stronger after any extract+insert peepholes have run, but I suppose that's why it's done in DAGCombine. :/

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D73978