[cfe-dev] Pointer Arithmetic

Thu Jul 12 17:06:58 PDT 2007

On Jul 12, 2007, at 4:49 PM, Keith Bauer wrote:

> OK, a new patch, with apparently-working (pointer - pointer) too.
>
> I've made getSize virtual in clang::Type (default implementation
> asserts, and it should become pure virtual when it's implemented for
> all the subclasses) and renamed the conflicting getSize in
> clang::ArrayType to getSizeExpr.
>
> I've added a getPointerDiffType alongside getSizeType, and updated
> SemaExpr to return that type for (pointer - pointer).
>
> There's a test case which checks that errors are generated or not as
> appropriate in all cases (tests/Parser/pointer-arithmetic.c).
>
> Again, criticism is welcome ;)

Cool, this is looking really good.  Minor (pedantic) comments:

+  /// the number of bits to represent the type.
+  // currently only implemented for BuiltinType -- the base  
implementation
+  // asserts.  Should become pure virtual in future.
+  virtual unsigned getSize() const;

The nearby code is a bit inconsistent, but please format the comment  
like this:

+  /// getSize - the number of bits to represent the type.
+  virtual unsigned getSize() const;

In addition, please make it non-virtual, implemented in Type.cpp.   
For type predicates, we tend to implement them in terms of switch  
statements on the canonical type (see the other predicates for  
examples).  We prefer non-virtual methods because we have vague plans  
to eliminate the vtable pointer at some point, and this keeps related  
code together in the same method.

 From a philosophical software engineering perspective, virtual  
methods are great when you have a class hierarchy that often  
changes.  The C type system doesn't often change, so we're ok :)

+  // On Darwin, ptrdiff_t is defined as a "int". This seems silly,  
so I'm
+  // using "long" instead...
+  // FIXME: should derive from "Target".
+  return LongTy;

Please use int :)

  bool Type::isConstantSizeType(SourceLocation *loc) const {
    if (const ArrayType *Ary = dyn_cast<ArrayType>(CanonicalType)) {
-    assert(Ary->getSize() && "Incomplete types don't have a size at  
all!");
-    return Ary->getSize()->isIntegerConstantExpr(loc); // Variable  
Length Array?
+    assert(Ary->getSizeExpr() && "Incomplete types don't have a size  
at all!");
+    return Ary->getSizeExpr()->isIntegerConstantExpr(loc); //  
Variable Length Array?
    }

Please keep the code in 80 columns.  I'm not a stickler about  
testcases fitting in 80 columns, but the C++ code should.

    RValue EmitDiv(RValue LHS, RValue RHS, QualType EltTy);
    RValue EmitRem(RValue LHS, RValue RHS, QualType EltTy);
    RValue EmitAdd(RValue LHS, RValue RHS, QualType EltTy);
+  RValue EmitPointerAdd(RValue LHS, QualType LHSTy, RValue RHS,  
QualType RHSTy, QualType EltTy);
    RValue EmitSub(RValue LHS, RValue RHS, QualType EltTy);
+  RValue EmitPointerSub(RValue LHS, QualType LHSTy, RValue RHS,  
QualType RHSTy, QualType EltTy);
    RValue EmitShl(RValue LHS, RValue RHS, QualType ResTy);

likewise.

+  case BinaryOperator::Add: {
+    QualType ExprTy = E->getType();
+    if (isa<PointerType>(ExprTy)) {
+      Expr *LHSExpr = E->getLHS();

This won't work correctly if you're adding to a typedef for a  
pointer.  Because of the way we handle typedefs, you should always  
strip off typedef information, by using ExprTy.getCanonicalType(),  
like this:

+  case BinaryOperator::Add: {
+    QualType ExprTy = E->getType();
+    if (isa<PointerType>(ExprTy.getCanonicalType())) {
+      Expr *LHSExpr = E->getLHS();

You can think of this as "looking through" any typedefs if present to  
find the actual underlying type.

Alternatively, there are some predicates on type that do this for  
you, which are even better to use when suitable.  This would make the  
code be:

+  case BinaryOperator::Add: {
+    QualType ExprTy = E->getType();
+    if (ExprTy->isPointerType()) {
+      Expr *LHSExpr = E->getLHS();

+  case BinaryOperator::Sub: {
+    QualType ExprTy = E->getType();
+    Expr *LHSExpr = E->getLHS();
+    if (isa<PointerType>(LHSExpr->getType())) {

+RValue CodeGenFunction::EmitPointerAdd(RValue LHS, QualType LHSTy,  
RValue RHS, QualType RHSTy, QualType ResTy) {
+  llvm::Value *LHSValue = LHS.getVal();
+  llvm::Value *RHSValue = RHS.getVal();
+  if (isa<PointerType>(LHSTy.getTypePtr())) {

likewise, these should check Ty->isPointerType().  In the second  
case, you don't need to call getTypePtr().

+RValue CodeGenFunction::EmitPointerAdd(RValue LHS, QualType LHSTy,  
RValue RHS, QualType RHSTy, QualType ResTy) {
+RValue CodeGenFunction::EmitPointerSub(RValue LHS, QualType LHSTy,  
RValue RHS, QualType RHSTy, QualType ResTy) {
...

please keep these in 80 cols.

For this case:

+RValue CodeGenFunction::EmitPointerSub(RValue LHS, QualType LHSTy,  
RValue RHS, QualType RHSTy, QualType ResTy) {
+  llvm::Value *LHSValue = LHS.getVal();
+  llvm::Value *RHSValue = RHS.getVal();
+  const PointerType *RHSPtrType = dyn_cast<PointerType> 
(RHSTy.getTypePtr());
+  if (RHSPtrType != NULL) {

I suggest writing this as:

+  if (const PointerType *RHSPtrType = dyn_cast<PointerType> 
(RHSTy.getQualifiedType())) {

+    // do we care about non-8-bit bytes?  Why is getSize returning  
bits?

1) no.  2) so we have a consistent measure of size, which works for  
bitfields and other things.  Dividing by 8 here is good, but please  
update the comment.

+    // is signed division correct?

Yep :)   For example:

int A[10];
int *B = &A[0], *C = &A[4];

C-B   ->  4    B-C    -> -4

Please update the patch and resend it, it's very close.  Thanks!

-Chris