[PATCH] D20541: [esan|cfrag] Create the cfrag variable for the runtime

Qin Zhao via llvm-commits llvm-commits at lists.llvm.org
Thu May 26 10:04:06 PDT 2016


zhaoqin added a comment.

> I don't think we're going full circle. But sorry if I'm hard to understand. Sometimes I might not explain what I would like to have in the best way. Feel free to ask for more details if what I'm asking for seems out of place or seems to go against what I (or others) have said before.

> 

> In your original CL you had `createCacheFragGV` which would:

> 

>   create a struct type

>   create the string global for the module

>   create the constant int and the constant expr to put in the struct

>   create the struct itself

>   create the GV

>    

> 

> This is to create the GV argument, which tools need for getting information from the ESan pass, basically.

> 

> Out of those, some are not needed (create the struct type. We can use an anonymous struct), some are common to any tool that needs an extra argument (create the struct itself, create the GV), and others are, for now, common to every tool that gets an argument (create the string global for the module). The rest is tool-specific, and is about creating the values to put in the struct.


IMHO, if you know a struct type layout, specify it upfront, i.e., create the struct type, and fill in the entries.
Because that's a good way to check that what you fill in is what you specified. I made several mistakes while implementing it, and thanks to the specified struct type I can easily spot the problem.

> My previous comment was about extracting away the struct + GV creation (and not needing to have a struct type).

>  Those two steps are what I pointed to from clang's source: every check will create an `ArrayRef` with the data to be put on the struct, and then common code simply takes that and creates an anonymous structure.


The EmitCheck has a much more work than simply creating a GV, so it makes sense to having a separate place creating the StaticData and then call EmitCheck.
Here, we have one statement creating the GV, it does not make any sense to have a separate function for that.

> Your change addressed those two points, and thanks for doing that. But the resulting code, for me, seems to be more confusing than it needs to be. One of the very good things about the clang example I posted was that you just need to look at the temporary array to figure out what exactly the structure that is passed to `__esan_init` contains. This also makes it easy to add/change fields to that structure (simply add a new `llvm::Constant *[]` element to the array creation).


It is easy to add/change fields, but also make it easy to make mistakes, because there is no struct type to check against.
Even we want to change the struct with a specified struct type, I do not see how hard to change the struct type, esp if it is at the beginning of the same function.
Basically, I feel having a specified struct type upfront has great benefit of error checking, and little downside extra step of changing the struct layout.

> My opinion is that it's much easier to create the array at a single place than to track down the `SmallVector`'s inserted elements.

> 

> > > How about:

> 

> > 

> 

> > >  The createEsanInitToolGV just dispatches to the appropriate (tool-specific) function and returns the same as that function (possibly creates the module string and passes it to the next function too, if you want to factor that out to here);

> 

> > 

> 

> > >  The tool-specific function:

> 

> > 

> 

> > > 

> 

> > 

> 

> > >   Gets all the info/adds constants

> 

> > 

> 

> > >   Sets up an array with the information to be placed in the structure

> 

> > 

> 

> > >   Calls into a helper "create a GV for tool init" function (optional if you want to return the GV from the tool-specific function, at the expense of code duplication)

> 

> > 

> 

> > >    

> 

> > 

> 

> > > 

> 

> > 

> 

> > > That way, each function deals only with one thing. Instead of having dispatch + (some) setup + GV creation all mixed in the same function. The creation of the structure is very easy to read, and you can easily make sure both places (here and compiler-rt) have a matching structure definition.

> 

> > 

> 

> > >  Adding a tool-specific function will also avoid introducing more bugs, since you'd only need to touch the dispatcher + your own tool's struct creator. The dispatcher wouldn't need to know anything about how the GV is created.

> 

> > 

> 

> > > 

> 

> > 

> 

> > > What do you think?

> 

> > 

> 

> > 

> 

> > I am not sure what's the difference from my original CL, where the struct is declared in each tool-specific create GV?

> 

> >  The original CL had a static struct declaration and then filled the entries and created the GV at the end of the function.

> 

> > 

> 

> >   StructType *CacheFragType =

> 

> >     StructType::get(Int32Ty, IntptrTy, nullptr);

> 

> >   ...

> 

> >   GlobalVariable *CacheFragGV = new GlobalVariable(

> 

> >       M, CacheFragType, true, GlobalVariable::InternalLinkage,

> 

> >       ConstantStruct::get(CacheFragType,

> 

> >                           ConstantInt::get(Int32Ty, Options.ToolType),

> 

> >                           ConstantExpr::getPointerCast(UnitName, IntptrTy),

> 

> >                           nullptr));

> 

> > 

> 

> > It seems clearer than pushing a vector in a separate the function and then create the GV without knowing what's inside.

> 

> > 

> 

> > Also, how do you guarantee the UnitName would be the first entry if each tool-specific function "Gets all the info/adds constants"?

> 

> 

> The rationale is that each tool knows what things to place in its structure. If it wants to get the `UnitName` and wants it to be the first thing in the structure, it will place it there.


The only reason I change it to Vec is to make sure we can insert UnitName upfront in CreateEsanInitToolGV, and extend the struct in tool specific createGV functions.

In summary:

1. each tool itself is responsible for the struct layout, so no vec is necessary.
2. having a struct type specified helps check the correctness of creating the struct data.
3. creating the GV is a single statement, not worth separting out.

Looks like my original CL was good.


http://reviews.llvm.org/D20541





More information about the llvm-commits mailing list