[PATCH] D12793: Three new security overflow builtins with generic argument types

Wed Sep 23 09:34:01 PDT 2015

DavidEGrayson added a comment.

Hi John.  I thought about your comment on line 1601 somewhat carefully and tried to imagine what it would look like.  See my inline comment.

Also, by the way, do you know if the results of the LLVM intrinsics like `llvm.smul.with.overflow.*` are guaranteed to be well-defined when there is an overflow?  This whole patch hinges on that, because the results of the builtins are supposed to be well-defined.  The documentation here <http://llvm.org/docs/LangRef.html#llvm-smul-with-overflow-intrinsics> is not so great.

================
Comment at: lib/CodeGen/CGBuiltin.cpp:1601
@@ +1600,3 @@
+    auto RITy = IntegerWidthAndSignedness(CGM.getContext(), RQTy);
+    auto EITy = EncompassingIntegerType({XITy, YITy, RITy});
+
----------------
rjmccall wrote:
> DavidEGrayson wrote:
> > rjmccall wrote:
> > > These are not the most evocative names you could have chosen. :)
> > > 
> > > Also, this strategy is correct, but it produces horrible code when the operands have the same signedness and the result is different, where the result would otherwise be an ordinary operation and a compare.  You really want to just merge the two operands and then max with the width of the result type.
> > I don't think that would work.  It sounds like you want me to remove RITy from the call to EncompassingIntegerType.  That means the ecompassing type could be smaller than the result type, and the arithmetic intrinsic would report an overflow even though the mathematical result is representable in the result type.
> > 
> > For example, if I am multiplying ((uint8_t)100) by ((uint8_t)100) and storing the result in a (uint16_t), the result needs to be 10000 and there is no overflow.  To arrive at that result, we need to convert the operands to 16-bit unsigned integers before multiplying them.  If we multiply them as 8-bit unsigned integers, the multiplication intrinsic will report an overflow and give a result of 16.  That seems like a messy situation and I don't see how a max operation would fix it.
> I think I was unclear.  I'm not saying that you should do a max as part of computing the dynamic result.  I'm saying that, when deciding the encompassing type, it should still be at least as wide as the result, but you should ignore the result type's signedness and then patch it up with a comparison at the end if necessary.  (I believe that in both cases, it's a signed comparison against zero.)
> 
> That is, when multiplying two uint8_t's and storing the result in an int16_t, we should just do an unsigned 16-bit multiply-with-overflow.  The computation overflows if the multiply overflows (which it provably doesn't, but we can let LLVM figure that out for now) or the result is signed-< 0.
OK, I understand now.  As an optimization, you would to like the operation type (the width/signedness that we use to perform the main math operation) to sometimes be the same width as the result type (the type that the third argument points to) but have a different sign.

I'm thinking about how that would work in all the different cases.  I have concluded that it would work in 5 out of 6 cases, but we would have to think carefully about each case and it is possible I have made some mistakes.  But here are the 6 cases:

1) `__builtin_add_overflow`, operation type signed, result type unsigned: It would work but it would be complicated.  If the signed addition intrinsic reports an overflow, we need to know whether the result was too big (which is OK) or too small (which is bad).  The most extreme underflow would be (-2^(N-1)) + (-2^(N-1)) = 0, so every underflow should give a non-negative result.  Similarly, every overflow-proper (case where the result was too big) would result in a negative number.  So the overflow bit returned from `__builtin_add_overflow` should be the overflow bit from the intrinsic XORed with a "less than zero" comparison.

2) `__builtin_add_overflow`, operation type unsigned, result type signed:  This is easier, because the only possible overflow reported by the unsigned addition intrinsic happens when the result is too big, which means it's definitely too big for the signed type.  The overflow bit returned from `__builtin_add_overflow` in this case would be the overflow bit from the intrinsic ORed with a "less than zero" comparison.

3) `__builtin_sub_overflow`, operation type signed, result type unsigned:  If the signed subtraction intrinsic reports an overflow, we need to know whether the result was too big (which is OK) or too small (which is bad).  The most extreme underflow would be (-2^(N-1)) - (2^(N-1)-1) = 1, so every underflow gives a positive value.  The most extreme overflow-proper would be (2^(N-1)-1) - (-2^(N-1)) = -1, so every overflow-proper gives a negative number.  Just like case #1, the overflow bit we return shoud be the overflow bit from the intrinsic XORed with a "less than zero" comparison.

4) `__builtin_sub_overflow`, operation type unsigned, result type signed:  If the unsigned subtraction intrinsic reports an overflow, it simply means Y > X.  That is OK as long as Y isn't too much greater than X.  The most extreme case is 0 - (2^N-1) = 1.  So if the result from the intrinsic is less than zero, then we are OK.  So the overflow bit returned from __builtin_sub_overflow should be the overflow bit from the intrinsic XORed with a "less than zero" comparison. 

5) `__builtin_mul_overflow`, operation type signed, result type unsigned: I don't think this case would work, so we would have to fall back to using an operation type that actually encompasses the result type.  For example, if we are multiplying two int16_t and the result is uint16_t, and the intrinsic gives us an overflow, it's really hard to know whether that overflow was OK (200 * 200) or not OK (200 * 527).

6) `__builtin_mul_overflow`, operation type unsigned, result type signed: The logic from case 2 applies here.

Do you want the patch to go in this direction?  Since we have to consider these 6 cases, I think it would be more complicated than what you were describing.

Repository:
  rL LLVM

http://reviews.llvm.org/D12793