[PATCH] R600: Don't unconditionally unroll loops with private memory accesses
Tom Stellard
tom at stellard.net
Fri Feb 14 07:54:33 PST 2014
From: Tom Stellard <thomas.stellard at amd.com>
Unconditionally unrolling these loops causes the size of the scrypt
kernel to explode, which exhausts all available memory on some systems.
---
lib/Target/R600/AMDGPUTargetTransformInfo.cpp | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/lib/Target/R600/AMDGPUTargetTransformInfo.cpp b/lib/Target/R600/AMDGPUTargetTransformInfo.cpp
index 7f37658..9cdfec5 100644
--- a/lib/Target/R600/AMDGPUTargetTransformInfo.cpp
+++ b/lib/Target/R600/AMDGPUTargetTransformInfo.cpp
@@ -110,9 +110,13 @@ void AMDGPUTTI::getUnrollingPreferences(Loop *L,
// instructions that make it through to the code generator. allocas
// require us to use indirect addressing, which is slow and prone to
// compiler bugs. If this loop does an address calculation on an
- // alloca ptr, then we want to unconditionally unroll the loop. In most
- // cases, this will make it possible for SROA to eliminate these allocas.
- UP.Threshold = UINT_MAX;
+ // alloca ptr, then we want to use a higher than normal loop unroll
+ // threshold. This will give SROA a better chance to eliminate these
+ // allocas.
+ //
+ // Don't use the maximum allowed value here as it will make some
+ // programs way too big.
+ UP.Threshold = 500;
}
}
}
--
1.8.1.5
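
To illustrate why lowering UP.Threshold from UINT_MAX to 500 matters, here
is a minimal sketch of a size-based full-unroll check. It is a simplified
model, not LLVM's actual unroller, which weighs more factors. The names
UnrollPrefs and shouldFullyUnroll, and the "size times trip count" cost
estimate, are assumptions made for illustration only.

```cpp
#include <climits>

// Hypothetical, simplified model of a size-based unroll decision.
// These names are illustrative and are not part of LLVM's API.
struct UnrollPrefs {
  unsigned Threshold; // max estimated size of the fully unrolled loop
};

// Returns true if fully unrolling a loop with the given per-iteration
// size and known trip count stays within the threshold.
constexpr bool shouldFullyUnroll(unsigned LoopSize, unsigned TripCount,
                                 UnrollPrefs UP) {
  // Widen before multiplying so the product cannot overflow.
  return static_cast<unsigned long long>(LoopSize) * TripCount <=
         UP.Threshold;
}
```

In this model, a loop of estimated size 40 with 1024 iterations is unrolled
when the threshold is UINT_MAX but rejected at 500, while the same loop
with 8 iterations (estimated size 320) still qualifies at 500.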