[llvm] 3f3018b - [SimplifyLibCalls] Pre-commit test case showing bug with wide char support

Bjorn Pettersson via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 7 06:29:51 PDT 2022


Author: Bjorn Pettersson
Date: 2022-10-07T15:29:31+02:00
New Revision: 3f3018b602c2650358c9ff26657b2d8ba7da7845

URL: https://github.com/llvm/llvm-project/commit/3f3018b602c2650358c9ff26657b2d8ba7da7845
DIFF: https://github.com/llvm/llvm-project/commit/3f3018b602c2650358c9ff26657b2d8ba7da7845.diff

LOG: [SimplifyLibCalls] Pre-commit test case showing bug with wide char support

The ValueTracking support for getting the string length of a wchar_t
string (e.g. using wcslen) seems to have some bugs.

The problem I've seen is that llvm::getConstantDataArrayInfo takes
both an "ElementSize" argument (basically indicating the size of a
char/element in bits) and an "Offset", which afaict is an offset
in the unit "number of elements". It then also uses
stripAndAccumulateConstantOffsets to get a "StartIdx", which afaict
is calculated in bytes. The returned Slice.Length is based on
arithmetic that adds/subtracts variables with different
units (bytes vs elements). Most notably, I think the "StartIdx" must
be scaled using the "ElementSize" to get correct results.
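
The unit mismatch described above can be illustrated with a small
standalone model. This is only a sketch under stated assumptions: it is
not the code of llvm::getConstantDataArrayInfo, and the helper names
(wsExample, lengthFromElement, lengthMixedUnits, lengthScaled) are
hypothetical. It assumes the byte offset ends up being used directly as
an element index, which would be consistent with the values 1 and 6 in
the CHECK lines of the new test.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, NOT LLVM code. Same contents as @ws in the test.
std::vector<std::uint32_t> wsExample() {
  return {9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
}

// Count the elements before the first zero, starting at element index
// `start`. Returns -1 when no terminator is found ("do not fold").
long long lengthFromElement(const std::vector<std::uint32_t> &data,
                            std::size_t start) {
  for (std::size_t i = start; i < data.size(); ++i)
    if (data[i] == 0)
      return static_cast<long long>(i - start);
  return -1;
}

// Mixed units (assumed faulty behaviour): a byte offset is used as if
// it were an element index.
long long lengthMixedUnits(const std::vector<std::uint32_t> &data,
                           std::size_t byteOffset) {
  return lengthFromElement(data, byteOffset);
}

// Consistent units: scale the byte offset by the element size, and give
// up (-1) when the offset is not a multiple of the element size.
long long lengthScaled(const std::vector<std::uint32_t> &data,
                       std::size_t byteOffset,
                       unsigned elementSizeInBits) {
  const std::size_t elementBytes = elementSizeInBits / 8;
  if (elementBytes == 0 || byteOffset % elementBytes != 0)
    return -1;
  return lengthFromElement(data, byteOffset / elementBytes);
}
```

In this model, wcslen(ws + 2) corresponds to a byte offset of 8 with
32-bit elements: lengthMixedUnits gives 1, while lengthScaled gives the
expected 7. A byte offset of 3 gives 6 with mixed units, while
lengthScaled declines to fold.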

This patch just adds a new test case showing that we get a wrong
result when doing wcslen(x + c). The actual fix for the above problem
will be done in a follow-up commit.

Differential Revision: https://reviews.llvm.org/D135262

Added: 
    

Modified: 
    llvm/test/Transforms/InstCombine/wcslen-1.ll

Removed: 
    


################################################################################
diff  --git a/llvm/test/Transforms/InstCombine/wcslen-1.ll b/llvm/test/Transforms/InstCombine/wcslen-1.ll
index 3af60a1cd572..a18798970b1f 100644
--- a/llvm/test/Transforms/InstCombine/wcslen-1.ll
+++ b/llvm/test/Transforms/InstCombine/wcslen-1.ll
@@ -214,4 +214,45 @@ define i64 @test_simplify12() {
   ret i64 %l
 }
 
+@ws = constant [10 x i32] [i32 9, i32 8, i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0]
+
+; Fold wcslen(ws + 2) => 7.
+; FIXME: This fold is faulty, result should be 7 not 1.
+define i64 @fold_wcslen_1() {
+; CHECK-LABEL: @fold_wcslen_1(
+; CHECK-NEXT:    ret i64 1
+;
+  %p = getelementptr inbounds [10 x i32], ptr @ws, i64 0, i64 2
+  %len = tail call i64 @wcslen(ptr %p)
+  ret i64 %len
+}
+
+; Should not crash on this, and no optimization expected (idea is to get into
+; llvm::getConstantDataArrayInfo looking for an array with 32-bit elements but
+; with an offset that isn't a multiple of the element size).  FIXME: Looks a
+; bit weird. Don't think we should return 6 here.
+define i64 @no_fold_wcslen_1() {
+; CHECK-LABEL: @no_fold_wcslen_1(
+; CHECK-NEXT:    ret i64 6
+;
+  %p = getelementptr [15 x i8], ptr @ws, i64 0, i64 3
+  %len = tail call i64 @wcslen(ptr %p)
+  ret i64 %len
+}
+
+@s8 = constant [10 x i8] [i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0]
+
+; Should not crash on this, and no optimization expected (idea is to get into
+; llvm::getConstantDataArrayInfo looking for an array with 32-bit elements but
+; with an offset that isn't a multiple of the element size).
+define i64 @no_fold_wcslen_2() {
+; CHECK-LABEL: @no_fold_wcslen_2(
+; CHECK-NEXT:    %len = tail call i64 @wcslen(ptr nonnull getelementptr inbounds ([10 x i8], ptr @s8, i64 0, i64 3))
+; CHECK-NEXT:    ret i64 %len
+;
+  %p = getelementptr [10 x i8], ptr @s8, i64 0, i64 3
+  %len = tail call i64 @wcslen(ptr %p)
+  ret i64 %len
+}
+
 attributes #0 = { null_pointer_is_valid }


        

