[PATCH] D106058: [DAG] Fold select(cond,binop(x,y),binop(x,z)) -> binop(x,select(cond,y,z))
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 15 16:14:18 PDT 2021
tra added a comment.
Is it intentional that two `fdiv arcp` get folded into `fdiv` w/o `arcp`?
This is part of what made the difference for NVPTX tests.
SelectionDAG has 21 nodes:
t0: ch = EntryToken
t14: v1f32,ch = load<(dereferenceable invariant load (s32) from `float addrspace(101)* null`, addrspace 101)> t0, TargetExternalSymbol:i32'repeated_div_recip_a_param_3', undef:i32
t15: f32 = extract_vector_elt t14, Constant:i32<0>
t4: v1i8,ch = load<(dereferenceable invariant load (s8) from `i1 addrspace(101)* null`, addrspace 101)> t0, TargetExternalSymbol:i32'repeated_div_recip_a_param_0', undef:i32
t5: i8 = extract_vector_elt t4, Constant:i32<0>
t6: i1 = truncate t5
t8: v1f32,ch = load<(dereferenceable invariant load (s32) from `float addrspace(101)* null`, addrspace 101)> t0, TargetExternalSymbol:i32'repeated_div_recip_a_param_1', undef:i32
t9: f32 = extract_vector_elt t8, Constant:i32<0>
t16: f32 = fdiv arcp t9, t15
t11: v1f32,ch = load<(dereferenceable invariant load (s32) from `float addrspace(101)* null`, addrspace 101)> t0, TargetExternalSymbol:i32'repeated_div_recip_a_param_2', undef:i32
t12: f32 = extract_vector_elt t11, Constant:i32<0>
t17: f32 = fdiv arcp t12, t15
t18: f32 = select t6, t16, t17
t19: ch = NVPTXISD::StoreRetval<(store (s32), align 1)> t0, Constant:i32<0>, t18
t20: ch = NVPTXISD::RET_FLAG t19
Combining: t20: ch = NVPTXISD::RET_FLAG t19
Combining: t19: ch = NVPTXISD::StoreRetval<(store (s32), align 1)> t0, Constant:i32<0>, t18
Combining: t18: f32 = select t6, t16, t17
Creating new node: t21: f32 = select t6, t9, t12
Creating new node: t22: f32 = fdiv t21, t15
... into: t22: f32 = fdiv t21, t15
Without `arcp` we have no choice now but to lower into a regular div instruction.
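To make the question concrete, here is a toy model of the fold seen in the log, written in Python rather than against the real SelectionDAG API. `Node`, `fold_select_of_binops` and the `keep_flags` switch are all made up for illustration; nothing here is LLVM code. With `keep_flags=False` the new binop comes out with no flags, as in the log. With `keep_flags=True` it keeps the flags common to both original binops, which is one plausible policy, not necessarily what the patch should do.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a DAG node: opcode, operands, fast-math flags."""
    op: str
    operands: Tuple[Union["Node", str], ...]
    flags: FrozenSet[str] = frozenset()


def fold_select_of_binops(sel: Node, keep_flags: bool) -> Node:
    """select(c, binop(y, x), binop(z, x)) -> binop(select(c, y, z), x).

    The shared operand is the second one, as in the fdiv case from the log.
    keep_flags=True: the new binop gets the flags present on both binops.
    keep_flags=False: the new binop gets no flags at all.
    """
    if sel.op != "select" or len(sel.operands) != 3:
        return sel
    cond, lhs, rhs = sel.operands
    matches = (
        isinstance(lhs, Node) and isinstance(rhs, Node)
        and lhs.op == rhs.op
        and len(lhs.operands) == 2 and len(rhs.operands) == 2
        and lhs.operands[1] == rhs.operands[1]
    )
    if not matches:
        return sel  # pattern does not apply; leave the select untouched
    new_sel = Node("select", (cond, lhs.operands[0], rhs.operands[0]))
    flags = (lhs.flags & rhs.flags) if keep_flags else frozenset()
    return Node(lhs.op, (new_sel, lhs.operands[1]), flags)


# Same shape as the dump: t16 = fdiv arcp t9, t15; t17 = fdiv arcp t12, t15.
t16 = Node("fdiv", ("t9", "t15"), frozenset({"arcp"}))
t17 = Node("fdiv", ("t12", "t15"), frozenset({"arcp"}))
t18 = Node("select", ("t6", t16, t17))

dropped = fold_select_of_binops(t18, keep_flags=False)  # mirrors the log
kept = fold_select_of_binops(t18, keep_flags=True)      # hypothetical alternative
```

In this model `dropped` is a plain `fdiv` of `select t6, t9, t12` by `t15`, while `kept` is the same node with `arcp` still attached.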
That said, even if we were to preserve `arcp`, we'd run into the second issue.
NVPTX itself does not know how to lower `FDIV32rr_prec arcp` correctly and lowers it as a regular `div`.
Previously, the two divs plus the select were combined into two multiplications by the reciprocal, and that is what made it look as though we could lower a div to a multiplication by the reciprocal.
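The arithmetic behind that observation, as a rough sketch. The function names are invented for this example, Python floats are doubles rather than f32, and nothing here models the actual NVPTX lowering. It only shows the three shapes side by side: two divisions by the same divisor, the same two quotients computed through one shared reciprocal (which `arcp` permits), and the single division left after the fold. Presumably, with only one division left there is no repeated divisor to share a reciprocal across.

```python
import math


def two_divs_then_select(cond, a, b, d):
    """Old shape: two divisions by the same d, followed by a select."""
    qa = a / d
    qb = b / d
    return qa if cond else qb


def two_divs_via_reciprocal(cond, a, b, d):
    """Old shape after the reciprocal rewrite: one 1/d, two multiplies."""
    r = 1.0 / d
    qa = a * r
    qb = b * r
    return qa if cond else qb


def select_then_one_div(cond, a, b, d):
    """Shape after the fold: the select comes first, one division remains."""
    return (a if cond else b) / d
```

The first and third functions perform the same division on the selected operand, so they agree exactly. The reciprocal version can differ from them in the last bits, which is why it needs `arcp` in the first place.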
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106058/new/
https://reviews.llvm.org/D106058