[PATCH] D73713: Fixed non-deterministic GPU intrinsic lowering.
Julian Gross via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 6 03:36:55 PST 2020
dfki-jugr added a comment.
@rriddle I tried to come up with a minimal example that includes a new dialect, but the issue does not occur in simple cases on our test machines. The easiest way to look into the issue is to move line 728 of LowerGpuOpsToNVVMOps.cpp down to line 750:
  // move this line
  // populateWithGenerated(converter.getDialect()->getContext(), &patterns);
  patterns
      .insert<GPUIndexIntrinsicOpLowering<gpu::ThreadIdOp, NVVM::ThreadIdXOp,
                                          NVVM::ThreadIdYOp, NVVM::ThreadIdZOp>,
              GPUIndexIntrinsicOpLowering<gpu::BlockDimOp, NVVM::BlockDimXOp,
                                          NVVM::BlockDimYOp, NVVM::BlockDimZOp>,
              GPUIndexIntrinsicOpLowering<gpu::BlockIdOp, NVVM::BlockIdXOp,
                                          NVVM::BlockIdYOp, NVVM::BlockIdZOp>,
              GPUIndexIntrinsicOpLowering<gpu::GridDimOp, NVVM::GridDimXOp,
                                          NVVM::GridDimYOp, NVVM::GridDimZOp>,
              GPUAllReduceOpLowering, GPUShuffleOpLowering, GPUFuncOpLowering,
              GPUReturnOpLowering>(converter);
  patterns.insert<OpToFuncCallLowering<AbsFOp>>(converter, "__nv_fabsf",
                                                "__nv_fabs");
  patterns.insert<OpToFuncCallLowering<CeilFOp>>(converter, "__nv_ceilf",
                                                 "__nv_ceil");
  patterns.insert<OpToFuncCallLowering<CosOp>>(converter, "__nv_cosf",
                                               "__nv_cos");
  patterns.insert<OpToFuncCallLowering<ExpOp>>(converter, "__nv_expf",
                                               "__nv_exp");
  patterns.insert<OpToFuncCallLowering<TanhOp>>(converter, "__nv_tanhf",
                                                "__nv_tanh");
  // to here
  populateWithGenerated(converter.getDialect()->getContext(), &patterns);
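
To illustrate why the position of that one call could matter, here is a toy model in plain C++. It is not MLIR code and does not use the MLIR API. It assumes that patterns with equal benefit are tried in insertion order and that the first match wins; whether the real pattern driver behaves this way for the patterns above is an assumption. `ToyPattern`, `firstMatch`, `makePatterns`, and the op name `toy.op` are hypothetical names made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a rewrite pattern: a label, a benefit, and a
// predicate that says whether the pattern applies to a given op name.
struct ToyPattern {
  std::string name;
  int benefit;
  std::function<bool(const std::string &)> matches;
};

// Returns the name of the pattern that would be applied to `op`.
// Patterns are ordered by descending benefit; std::stable_sort keeps the
// insertion order among patterns with equal benefit (assumed tie-breaking).
std::string firstMatch(std::vector<ToyPattern> patterns,
                       const std::string &op) {
  std::stable_sort(patterns.begin(), patterns.end(),
                   [](const ToyPattern &a, const ToyPattern &b) {
                     return a.benefit > b.benefit;
                   });
  for (const ToyPattern &p : patterns)
    if (p.matches(op))
      return p.name;
  return ""; // no pattern matched
}

// Builds a list with two equal-benefit patterns that both match "toy.op".
// `generatedFirst` mimics calling the generated-pattern population before
// (true) or after (false) the hand-written inserts.
std::vector<ToyPattern> makePatterns(bool generatedFirst) {
  auto matchesToyOp = [](const std::string &op) { return op == "toy.op"; };
  ToyPattern generated{"generated", 1, matchesToyOp};
  ToyPattern handWritten{"hand_written", 1, matchesToyOp};
  if (generatedFirst)
    return {generated, handWritten};
  return {handWritten, generated};
}
```

In this model, `firstMatch(makePatterns(true), "toy.op")` selects the generated pattern, while `firstMatch(makePatterns(false), "toy.op")` selects the hand-written one, so moving a single population call changes the resulting lowering.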
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73713/new/
https://reviews.llvm.org/D73713