[cfe-dev] Bug in clang parsing pointer to an array in function parameter list

Justin Handville uniheliodem at gmail.com
Sun Sep 9 17:40:36 PDT 2007


Well, I'll defer to you and Chris on this.  Let me know where to make
the change, and I will be more than happy to make it.  Would this also
be a good time to address the FIXME at the bottom of the comment in
Sema::ParseParamDeclarator corresponding to this?

On 9/9/07, Steve Naroff <snaroff at apple.com> wrote:
>
> On Sep 9, 2007, at 4:32 PM, Justin Handville wrote:
>
> > Okay.  I found the section in Sema::ParseParamDeclarator where the bug
> > is occurring.  Apparently, this is where the fix needs to go.  I'll
> > work on analyzing why this code is failing, put together some test
> > cases, and submit the patch at the appropriate place.  I apologize,
> > I'm still learning the code.
> >
>
> No need to apologize. Here is some more data that might help...
>
>  From my perspective, it's not clear the fix can be isolated to
> Sema::ParseParamDeclarator().
>
> First, let's be clear on the bug. The bug is the functions type is
> not in sync with it's ParmVarDecl AST node.
>
> The function type is computed *before* calling
> Sema::ParseParamDeclarator(), which does the ParmVarDecl conversion.
> As a result, we could either...
>
> (a) Move the argument conversion "up" to Sema::GetTypeForDeclarator
> (). If we did this, the conversion code could be removed from
> Sema::ParseParamDeclarator().
> (b) Have Sema::ParseParamDeclarator() some how change the functions
> type when it does the conversion for ParmVarDecl (I don't like this,
> put want to mention it for completeness).
> (c) See if it is possible for the code generator to rely on the
> ParmVarDecls, *not* the function type. Chris and I thought long and
> hard about this particular ParmVarDecl conversion (hence the large
> comment in the code). That said, this may well be a code gen bug. If
> not, then we will need to consider a front-end fix.
>
> snaroff
>
> Breakpoint 1, clang::Sema::GetTypeForDeclarator (this=0x2605900,
> D=@0xbffff2a0, S=0x2605a50) at SemaType.cpp:248
> 248           if (!FTI.hasPrototype) {
> (gdb) c
> Continuing.
> int (foo)(int, char *[])
>
> Breakpoint 2, clang::Sema::ParseParamDeclarator (this=0x2605900,
> FTI=@0xbffff2c0, ArgNo=0, FnScope=0x2605b20) at SemaDecl.cpp:688
> 688       QualType parmDeclType = QualType::getFromOpaquePtr
> (PI.TypeInfo);
>
> > On 9/9/07, Justin Handville <uniheliodem at gmail.com> wrote:
> >> Also, note that the issue is with the initial store of the argv
> >> argument generated by the code generator, which isn't shown in the
> >> parse-ast-dump call.
> >>
> >> I'm still not familiar with the code base, so I'll need a little
> >> guidance with where this change belongs, but the bug definitely
> >> exists.
> >>
> >> On 9/9/07, Justin Handville <uniheliodem at gmail.com> wrote:
> >>> Ah.  Then I'll check in the front end.  I thought that the char*
> >>> [] in
> >>> the parameter list looked a little funny, but I thought maybe that
> >>> might be intended.
> >>>
> >>> Without the patch, the code asserts, since it is trying to store a
> >>> char*[] into a char** location.  It wouldn't be difficult to add the
> >>> check to the appropriate place in the front end to change the
> >>> argument
> >>> declaration to a char**, as llvm-gcc does.
> >>>
> >>> On 9/9/07, Steve Naroff <snaroff at apple.com> wrote:
> >>>>
> >>>> On Sep 9, 2007, at 3:32 PM, Justin Handville wrote:
> >>>>
> >>>>> With the following patch, I can now compile and execute this
> >>>>> source
> >>>>> file:
> >>>>>
> >>>>> int puts(char*);
> >>>>>
> >>>>> int main(int argc, char* argv[])
> >>>>> {
> >>>>>   char* foo = argv[0];
> >>>>>
> >>>>>   puts(foo);
> >>>>>
> >>>>>   return 0;
> >>>>> }
> >>>>>
> >>>>
> >>>> Justin,
> >>>>
> >>>> I'm a bit confused. The front-end (Sema::ParseParamDeclarator)
> >>>> already converts argv from "char *[]" to "char **".
> >>>>
> >>>> See the "DeclRefExpr" node below (on line 6).
> >>>>
> >>>> I don't know why the code generator would need to fiddle with this.
> >>>>
> >>>> Chris will need to advise...
> >>>>
> >>>> snaroff
> >>>>
> >>>> [dylan:~/llvm/tools/clang] admin% ../../Debug/bin/clang justin.c -
> >>>> parse-ast-dump
> >>>>
> >>>> int puts(char *);
> >>>>
> >>>> int main(int argc, char *argv[])
> >>>> (CompoundStmt 0x3105ed0 <justin.c:5:1, line:11:1>
> >>>>    (DeclStmt 0x3105bf0 <:0:0>
> >>>>      0x3105dd0 "char *foo =
> >>>>        (ArraySubscriptExpr 0x3105db0 <justin.c:6:15, col:21>
> >>>> 'char *'
> >>>>          (DeclRefExpr 0x3105d70 <col:15> 'char **' Decl='argv'
> >>>> 0x3105d40)
> >>>>          (IntegerLiteral 0x3105d90 <col:20> 'int' 0))")
> >>>>    (CallExpr 0x3105e70 <line:8:3, col:11> 'int'
> >>>>      (ImplicitCastExpr 0x3105e60 <col:3> 'int (*)(char *)'
> >>>>        (DeclRefExpr 0x3105e00 <col:3> 'int (char *)' Decl='puts'
> >>>> 0x3105c20))
> >>>>      (DeclRefExpr 0x3105e20 <col:8> 'char *' Decl='foo' 0x3105dd0))
> >>>>    (ReturnStmt 0x3105ec0 <line:10:3, col:10>
> >>>>      (IntegerLiteral 0x3105ea0 <col:10> 'int' 0)))
> >>>>
> >>>>
> >>>>> Here's the patch.  I'm curious to see if this will work in all
> >>>>> cases.
> >>>>> I am still adding test cases to the build, so it is a bit
> >>>>> experimental:
> >>>>>
> >>>>> ndex: CodeGen/CodeGenFunction.h
> >>>>> ==================================================================
> >>>>> =
> >>>>> --- CodeGen/CodeGenFunction.h   (revision 41790)
> >>>>> +++ CodeGen/CodeGenFunction.h   (working copy)
> >>>>> @@ -237,6 +237,11 @@
> >>>>>    };
> >>>>>    llvm::SmallVector<BreakContinue, 8> BreakContinueStack;
> >>>>>
> >>>>> +  static bool
> >>>>> +  isCompatibleDeclPointerType(
> >>>>> +    const llvm::Type* Arg,
> >>>>> +    const llvm::Type* DeclPtr);
> >>>>> +
> >>>>>  public:
> >>>>>    CodeGenFunction(CodeGenModule &cgm);
> >>>>>
> >>>>> Index: CodeGen/CGDecl.cpp
> >>>>> ==================================================================
> >>>>> =
> >>>>> --- CodeGen/CGDecl.cpp  (revision 41790)
> >>>>> +++ CodeGen/CGDecl.cpp  (working copy)
> >>>>> @@ -96,11 +96,69 @@
> >>>>>    }
> >>>>>  }
> >>>>>
> >>>>> +bool
> >>>>> +CodeGenFunction::isCompatibleDeclPointerType(
> >>>>> +  const llvm::Type* ArgType,
> >>>>> +  const llvm::Type* DeclType)
> >>>>> +{
> >>>>> +  if(
> >>>>> +      ArgType->getTypeID() != llvm::Type::PointerTyID
> >>>>> +   || DeclType->getTypeID() != llvm::Type::PointerTyID)
> >>>>> +  {
> >>>>> +    return false;
> >>>>> +  }
> >>>>> +
> >>>>> +  while(
> >>>>> +      ArgType->getNumContainedTypes() == 1
> >>>>> +   && DeclType->getNumContainedTypes() == 1)
> >>>>> +  {
> >>>>> +    const llvm::Type* ArgContainedType = ArgType-
> >>>>> >getContainedType
> >>>>> (0);
> >>>>> +    const llvm::Type* DeclContainedType = DeclType-
> >>>>>> getContainedType(0);
> >>>>> +
> >>>>> +    //if the two contained types point to the same
> >>>>> +    //location, then these pointer types are compatible
> >>>>> +    if(
> >>>>> +      ArgContainedType == DeclContainedType)
> >>>>> +    {
> >>>>> +      return true;
> >>>>> +    }
> >>>>> +    //is the pointer indirection level different?
> >>>>> +    else if(
> >>>>> +      ArgContainedType->getNumContainedTypes() !=
> >>>>> +        DeclContainedType->getNumContainedTypes())
> >>>>> +    {
> >>>>> +      break;
> >>>>> +    }
> >>>>> +    //are these compatible pointer / array types?
> >>>>> +    else if(
> >>>>> +        llvm::SequentialType::classof(ArgContainedType)
> >>>>> +     && llvm::SequentialType::classof(DeclContainedType))
> >>>>> +    {
> >>>>> +      if( ArgContainedType->getNumContainedTypes() != 1 )
> >>>>> +      {
> >>>>> +        //FIXME will this ever happen with arrays / pointers?
> >>>>> +        break;
> >>>>> +      }
> >>>>> +
> >>>>> +      ArgType = ArgContainedType;
> >>>>> +      DeclType = DeclContainedType;
> >>>>> +    }
> >>>>> +    else
> >>>>> +    {
> >>>>> +      //one or more of these is something other than an array /
> >>>>> pointer
> >>>>> +      break;
> >>>>> +    }
> >>>>> +  }
> >>>>> +
> >>>>> +  return false;
> >>>>> +}
> >>>>> +
> >>>>>  /// Emit an alloca for the specified parameter and set up
> >>>>> LocalDeclMap.
> >>>>>  void CodeGenFunction::EmitParmDecl(const ParmVarDecl &D,
> >>>>> llvm::Value *Arg) {
> >>>>>    QualType Ty = D.getCanonicalType();
> >>>>>
> >>>>>    llvm::Value *DeclPtr;
> >>>>> +
> >>>>>    if (!Ty->isConstantSizeType(getContext())) {
> >>>>>      // Variable sized values always are passed by-reference.
> >>>>>      DeclPtr = Arg;
> >>>>> @@ -111,7 +169,19 @@
> >>>>>        // TODO: Alignment
> >>>>>        DeclPtr = new llvm::AllocaInst(LTy, 0, std::string
> >>>>> (D.getName
> >>>>> ())+".addr",
> >>>>>                                       AllocaInsertPt);
> >>>>> -
> >>>>> +
> >>>>> +      //do we need to perform an implicit conversion?
> >>>>> +      if( isCompatibleDeclPointerType(Arg->getType(), DeclPtr-
> >>>>>> getType()) )
> >>>>> +      {
> >>>>> +        llvm::Value* CastPtr =
> >>>>> +          Builder.CreateBitCast(
> >>>>> +            Arg,
> >>>>> +            LTy,
> >>>>> +            (std::string(D.getName()) + ".implconv").c_str());
> >>>>> +
> >>>>> +        Arg = CastPtr;
> >>>>> +      }
> >>>>> +
> >>>>>        // Store the initial value into the alloca.
> >>>>>        Builder.CreateStore(Arg, DeclPtr);
> >>>>>      } else {
> >>>>>
> >>>>> On 9/9/07, Justin Handville <uniheliodem at gmail.com> wrote:
> >>>>>> Perhaps the better question to ask is where should this transform
> >>>>>> occur?
> >>>>>>
> >>>>>> On 9/9/07, Justin Handville <uniheliodem at gmail.com> wrote:
> >>>>>>> The following code snippet generates an assert when compiled
> >>>>>>> with
> >>>>>>> clang -emit-llvm
> >>>>>>>
> >>>>>>> int main(int argc, char* argv[])
> >>>>>>> {
> >>>>>>>   return 0;
> >>>>>>> }
> >>>>>>>
> >>>>>>> Assert:
> >>>>>>>
> >>>>>>> Assertion failed: (getOperand(0)->getType() ==
> >>>>>>> cast<PointerType>(getOperand(1)->getType())->getElementType() &&
> >>>>>>> "Ptr
> >>>>>>> must be a pointer to Val type!"), function AssertOK, file
> >>>>>>> Instructions.cpp, line 796.
> >>>>>>>
> >>>>>>> The problem here is at line 116 of CGDecl.cpp:
> >>>>>>>
> >>>>>>>     if (LTy->isFirstClassType()) {
> >>>>>>>       // TODO: Alignment
> >>>>>>>       DeclPtr = new llvm::AllocaInst(LTy, 0, std::string
> >>>>>>> (D.getName
> >>>>>>> ())+".addr",
> >>>>>>>                                      AllocaInsertPt);
> >>>>>>>
> >>>>>>>       // Store the initial value into the alloca.
> >>>>>>>       Builder.CreateStore(Arg, DeclPtr);  //attempting to store
> >>>>>>> char*[] as char**
> >>>>>>>
> >>>>>>> The problem is that llvm treats char*[] and char** as two
> >>>>>>> distinct
> >>>>>>> types as far as I can tell.  EmitParmDecl should convert the
> >>>>>>> type
> >>>>>>> to a
> >>>>>>> char** before calling CreateStore, since this is how the
> >>>>>>> argument
> >>>>>>> would be passed in C ABI.
> >>>>>>>
> >>>>>>> I assume that a transform step should be added before the
> >>>>>>> CreateStore
> >>>>>>> that could normalize the argument to match the declptr.  I
> >>>>>>> assume a
> >>>>>>> good approach would be to descend through both Arg and DeclPtr,
> >>>>>>> transforming Arg to match DeclPtr where compatible, or
> >>>>>>> raising an
> >>>>>>> appropriate diagnostic error if such a transformation is not
> >>>>>>> possible.
> >>>>>>>
> >>>>>>> If I can get some approval that this route would work in this
> >>>>>>> case,
> >>>>>>> then I will go ahead and submit the patch.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Justin
> >>>>>>>
> >>>>>>
> >>>>> _______________________________________________
> >>>>> cfe-dev mailing list
> >>>>> cfe-dev at cs.uiuc.edu
> >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>>>
> >>>>
> >>>
> >>
>
>



More information about the cfe-dev mailing list