[PATCH] D18438: Calculate __builtin_object_size when pointer depends on a condition

George Burgess IV via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 24 11:46:17 PDT 2016


george.burgess.iv added a comment.

Hi! Thanks for doing this.


================
Comment at: include/llvm/Analysis/MemoryBuiltins.h:36
@@ +35,3 @@
+enum ObjSizeMode {
+  EXACT = 0,
+  MIN = 1,
----------------
Style nit: This should be CamelCase, not all caps. 

================
Comment at: include/llvm/Analysis/MemoryBuiltins.h:39
@@ -35,1 +38,3 @@
+  MAX = 2
+};
 
----------------
These identifiers seem really general to be putting them into the `llvm` namespace. Can we make this an `enum class` instead? Or can we prefix them all with something like `OSM_`?

================
Comment at: include/llvm/Analysis/MemoryBuiltins.h:137
@@ -131,3 +136,3 @@
 /// If RoundToAlign is true, then Size is rounded up to the aligment of allocas,
 /// byval arguments, and global variables.
 bool getObjectSize(const Value *Ptr, uint64_t &Size, const DataLayout &DL,
----------------
Please document what `Mode` does here.

================
Comment at: lib/Analysis/MemoryBuiltins.cpp:559
@@ -555,3 +558,3 @@
 SizeOffsetType ObjectSizeOffsetVisitor::visitPHINode(PHINode&) {
   // too complex to analyze statically.
   return unknown();
----------------
Do we not need to handle Phis, as well?

Consider:

```
void sideeffect();
int foo(int N) {
  char Buf[100];
  char *Ptr = Buf;
  if (N > 0) {
    sideeffect();
    Ptr = Buf + 50;
  }
  return __builtin_object_size(Ptr, 0);
}
```

...Which turns into:

```
define i32 @foo(i32 %N) {
entry:
  %Buf = alloca [100 x i8], align 16
  %0 = getelementptr inbounds [100 x i8], [100 x i8]* %Buf, i64 0, i64 0
  %cmp = icmp sgt i32 %N, 0
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  tail call void (...) @sideeffect() #4
  %add.ptr = getelementptr inbounds [100 x i8], [100 x i8]* %Buf, i64 0, i64 50
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  %Ptr.0 = phi i8* [ %add.ptr, %if.then ], [ %0, %entry ]
  %1 = call i64 @llvm.objectsize.i64.p0i8(i8* %Ptr.0, i1 false)
  %conv = trunc i64 %1 to i32
  ret i32 %conv
}
```

================
Comment at: lib/Analysis/MemoryBuiltins.cpp:567
@@ +566,3 @@
+  if (bothKnown(TrueSide) && bothKnown(FalseSide)) {
+    if (TrueSide == FalseSide) { 
+      return TrueSide;
----------------
This seems like it would fail given:

```
int foo(int N) {
  char Big[20];
  char Small[10];
  char *Ptr = N ? Big + 10 : Small;
  return __bos(Ptr);
}
```

...And it looks like this was a bug in the original implementation, too. Yay bugs. :)

If you'd like to fix it, feel free; otherwise, I'll do it after your patch lands. If you decide to fix it, please add a test-case that doesn't go through CGP.

================
Comment at: lib/Analysis/MemoryBuiltins.cpp:571
@@ +570,3 @@
+    if (Mode == MIN) {
+      return TrueSide.first.getZExtValue() < FalseSide.first.getZExtValue() ?
+             TrueSide : FalseSide;
----------------
Is there a reason we can't just compare `TrueSide.first` and `FalseSide.first` directly?

================
Comment at: lib/Analysis/MemoryBuiltins.cpp:571
@@ +570,3 @@
+    if (Mode == MIN) {
+      return TrueSide.first.getZExtValue() < FalseSide.first.getZExtValue() ?
+             TrueSide : FalseSide;
----------------
george.burgess.iv wrote:
> Is there a reason we can't just compare `TrueSide.first` and `FalseSide.first` directly?
Do we really want to ignore offsets here? It seems that doing would cause this to break on code like:

```
int foo(int N) {
  char Small[10];
  char Large[20];

  char *Ptr = N ? Small : Large + 19;
  return __bos(Ptr);
}
```

================
Comment at: lib/Analysis/MemoryBuiltins.cpp:575
@@ +574,3 @@
+    if (Mode == MAX) {
+      return TrueSide.first.getZExtValue() > FalseSide.first.getZExtValue() ?
+             TrueSide : FalseSide;
----------------
Same comments as the `Mode == MIN` conditional above

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1788
@@ +1787,3 @@
+      if (getObjectSize(II->getArgOperand(0),
+                       Size, *DL, TLInfo, false, Mode)) {
+        RetVal = ConstantInt::get(ReturnTy, Size);
----------------
Style nit: Add another space of indentation please

================
Comment at: lib/CodeGen/CodeGenPrepare.cpp:1792
@@ +1791,3 @@
+        bool Min = (Op1->getZExtValue() == 1);
+        RetVal = ConstantInt::get(ReturnTy, Min ? 0 : -1ULL);
+      }
----------------
Can we just say `Mode == MIN` in the ternary here?

================
Comment at: test/CodeGen/X86/builtin-condition.ll:1
@@ +1,2 @@
+; RUN: llc < %s | FileCheck %s
+
----------------
I recommend using `opt` instead, so we don't have to compare against asm. Given that our test cases should be simple, you can probably get away with changing the run line to:

`; RUN: opt -codegenprepare -S < %s | FileCheck %s`

...And it will output LLVM IR with all of the `objectsize` calls lowered for you :)

================
Comment at: test/CodeGen/X86/builtin-condition.ll:34
@@ +33,3 @@
+
+define i64 @foo(i32 %flag) #0 {
+entry:
----------------
Can we add a test-case for something like

```
int foo(int N) {
  char Small[10];
  char Large[20];

  char *Ptr = N ? Small : Large + 19;
  return __bos(Ptr);
}
```

Too, please?


Repository:
  rL LLVM

http://reviews.llvm.org/D18438





More information about the llvm-commits mailing list