[llvm] [Transforms] LoopIdiomRecognize recognize strlen and wcslen (PR #108985)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 11 07:46:02 PDT 2024
================
@@ -1524,6 +1545,232 @@ static Value *matchCondition(BranchInst *BI, BasicBlock *LoopEntry,
return nullptr;
}
+/// Recognizes a strlen idiom by checking for loops that increment
+/// a char pointer and then subtract with the base pointer.
+///
+/// If detected, transforms the relevant code to a strlen function
+/// call, and returns true; otherwise, returns false.
+///
+/// The core idiom we are trying to detect is:
+/// \code
+/// start = str;
+/// do {
+/// str++;
+/// } while(*str != '\0');
+/// \endcode
+///
+/// The transformed output is similar to the following C code:
+/// \code
+/// str = start + strlen(start)
+/// len = str - start
+/// \endcode
+///
+/// The pointer subtraction is later folded by InstCombine.
+bool LoopIdiomRecognize::recognizeAndInsertStrLen() {
+  if (DisableLIRPStrlen)
+    return false;
+
+  // Give up if the loop has multiple blocks or multiple backedges.
+  if (CurLoop->getNumBackEdges() != 1 || CurLoop->getNumBlocks() != 1)
+    return false;
+
+  // It should have a preheader containing nothing but an unconditional branch.
+  auto *Preheader = CurLoop->getLoopPreheader();
+  if (!Preheader || &Preheader->front() != Preheader->getTerminator())
+    return false;
----------------
RolandF77 wrote:
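This condition seems unnecessarily restrictive. For example, the idiom is recognized in check1 but not in check2, presumably because the calls that precede the loop in check2 leave the preheader with more than an unconditional branch: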
#include <stdio.h>
#include <string.h>

/* Loop comes first, so nothing precedes it in the function. */
void check1(char *s) {
  char *p = s;
  do {
    p++;
  } while (*p);
  printf("%s %zu\n", s, strlen(s));
  printf("%td\n", p - s);
}

/* Same loop, but a printf/strlen call precedes it. */
void check2(char *s) {
  printf("%s %zu\n", s, strlen(s));
  char *p = s;
  do {
    p++;
  } while (*p);
  printf("%td\n", p - s);
}

int main(int argc, char **argv) {
  check1(argv[0]);
  check2(argv[0]);
  return 0;
}
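
For reference, here is a minimal sketch of the equivalence the transform relies on, following the `str = start + strlen(start)` form given in the patch comment. The helper names `len_by_loop` and `len_by_strlen` are hypothetical and do not appear in the patch. The sketch assumes a non-empty input string, since the do/while form reads `s[1]` before testing anything:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the hand-written loop used in check1 and check2.
   Precondition (an assumption of this sketch): s is non-empty, because
   the do/while form advances the pointer before the first test. */
static ptrdiff_t len_by_loop(const char *s) {
  assert(s[0] != '\0');
  const char *p = s;
  do {
    p++;
  } while (*p);
  return p - s;
}

/* Hypothetical helper: the shape described in the patch comment,
   i.e. str = start + strlen(start); len = str - start. */
static ptrdiff_t len_by_strlen(const char *s) {
  const char *p = s + strlen(s);
  return p - s;
}
```

Both helpers should return the same value for any non-empty string, regardless of what code runs before the loop, which is why check2 looks like it ought to be handled as well.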
https://github.com/llvm/llvm-project/pull/108985