[PATCH] Make InstCombine aware of TargetTransformInfo when optimize extension

Fri Jun 26 12:18:13 PDT 2015

What's special about NVPTX is that it emits PTX, a virtual ISA, instead of real machine code (aka SASS). The CUDA driver JIT-compiles PTX to machine code at runtime. Ideally, NVPTX would codegen machine code directly, but that requires the SASS ISA to be public.

Therefore, i64 is a legal DL type for PTX; it is just more expensive at runtime because the machine code needs two 32-bit registers to simulate an i64. Does LLVM's target-independent IR optimizer implicitly assume that all legal integer types are equally cheap? I don't think that's the right assumption. IIRC, AMDGPU has similar issues where widening integers can hurt performance [https://llvm.org/bugs/show_bug.cgi?id=21148]. We solved that by disabling indvar widening when 64-bit arithmetic is more expensive than 32-bit.
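
To illustrate why an i64 that lives in two 32-bit registers costs more, here is a minimal C++ sketch of a 64-bit add built from 32-bit halves. The names `U64Pair`, `add64`, `split`, and `join` are hypothetical, and this is not the actual SASS sequence (which is not public); it only shows that one wide add turns into an add plus an add-with-carry.

```cpp
#include <cstdint>

// Hypothetical model of an i64 held in two 32-bit registers.
struct U64Pair {
  uint32_t lo;  // low 32 bits
  uint32_t hi;  // high 32 bits
};

// Split a 64-bit value into its two 32-bit halves.
U64Pair split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

// Reassemble the 64-bit value from its halves.
uint64_t join(U64Pair p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}

// One 64-bit add becomes two dependent 32-bit operations:
// an add on the low halves, then an add-with-carry on the high halves.
U64Pair add64(U64Pair a, U64Pair b) {
  U64Pair r;
  r.lo = a.lo + b.lo;                       // wraps modulo 2^32
  uint32_t carry = (r.lo < a.lo) ? 1u : 0u; // unsigned overflow check
  r.hi = a.hi + b.hi + carry;
  return r;
}
```

The second operation depends on the carry from the first, so the cost is more than just the extra register.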

I am not sure how viable the second approach (i.e. narrowing the operations as much as possible) is. While widening is sound, narrowing can be unsound in many cases, not to mention the complexity of bit tracking etc. Anyhow, it doesn't feel right to undo something for certain targets rather than not doing it in the target-independent phase (with TTI checks, of course) in the first place.
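
As a small example of narrowing being unsound, consider a zext/add/lshr/trunc chain written in C++. The function names `avgWide` and `avgNarrow` are hypothetical and the example is only a sketch of the problem, not something taken from the patch.

```cpp
#include <cstdint>

// Wide form: zero-extend both operands to 64 bits, add, shift, truncate.
// The carry out of bit 31 survives in bit 32 and is shifted back in.
uint32_t avgWide(uint32_t a, uint32_t b) {
  uint64_t sum = static_cast<uint64_t>(a) + static_cast<uint64_t>(b);
  return static_cast<uint32_t>(sum >> 1);
}

// Naively narrowed form: the same operations done in 32 bits.
// The add wraps modulo 2^32, so the carry is lost before the shift.
uint32_t avgNarrow(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;
  return sum >> 1;
}
```

For small inputs the two agree, but `avgWide(0x80000000u, 0x80000000u)` is `0x80000000` while `avgNarrow` on the same inputs gives `0`. A narrowing pass would have to prove that the high bits are never observed, which is where the bit tracking comes in.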

Alternatively, what do you think about modifying `ShouldChangeType` to disallow such conversions for NVPTX instead of introducing a cost model? `ShouldChangeType` sounds to me like the place where we put legality checks. Although i64 is a DL-legal type, such a sext/zext can be considered "illegal" for performance reasons.
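
A rough standalone sketch of what I have in mind, written as a toy model rather than against the real InstCombine code. Everything here is hypothetical: `ToyTarget`, `isExpensiveInteger`, and the free function `shouldChangeType` are made-up stand-ins, and the first two rules are only my approximation of the existing legality logic.

```cpp
#include <set>

// Hypothetical stand-in for the data layout plus a target hook; not LLVM's API.
struct ToyTarget {
  std::set<unsigned> legalWidths;      // widths the data layout calls legal
  std::set<unsigned> expensiveWidths;  // legal but slow widths (assumed hook)

  bool isLegalInteger(unsigned w) const { return legalWidths.count(w) != 0; }
  bool isExpensiveInteger(unsigned w) const {
    return expensiveWidths.count(w) != 0;
  }
};

// Toy legality check with the proposed performance veto added at the end.
bool shouldChangeType(const ToyTarget &t, unsigned fromWidth, unsigned toWidth) {
  bool fromLegal = t.isLegalInteger(fromWidth);
  bool toLegal = t.isLegalInteger(toWidth);

  // Approximation of the existing rules: never go from legal to illegal,
  // and never grow a type that is already illegal.
  if (fromLegal && !toLegal) return false;
  if (!fromLegal && !toLegal && toWidth > fromWidth) return false;

  // Proposed addition: do not widen from a cheap type into a type that is
  // legal but expensive on this target.
  if (toWidth > fromWidth && t.isExpensiveInteger(toWidth) &&
      !t.isExpensiveInteger(fromWidth))
    return false;

  return true;
}
```

With an NVPTX-like target, `ToyTarget{{16, 32, 64}, {64}}`, widening i32 to i64 is rejected while i64 to i32 and i16 to i32 are still allowed. A target with no expensive widths behaves as before.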

LMK. Thanks!

http://reviews.llvm.org/D10750
