[cfe-dev] Enhancing the DataFlowSolver in clang

Fri Jul 23 11:16:35 PDT 2010

On Jul 21, 2010, at 1:28 AM, Simone Pellegrini wrote:

> On 07/20/2010 08:56 PM, Ted Kremenek wrote:
>> On Jul 20, 2010, at 7:52 AM, Simone Pellegrini wrote:
>> 
>>   
>>> I run make tests from the root clang folder and none of the existing tests have been broken.
>>> 
>>> In attachment the patch file with the fix to the DataFlowSolver.h
>>>     
>> I think this patch is mostly there, but the following assumes that ValTy is a bitvector:
>> 
>> +		// if there the block value has been initialized, and this is the first
>> +		// iteration, use the initialized value
>> +		if(noEdges and !D.getBlockDataMap()[B].sizesEqual(ValTy()))
>> +			Merge(V, D.getBlockDataMap()[B]);
>> 
>> 'sizesEqual' should not appear anywhere in DataSolver.h.  The solver needs to be generic to not assume specific details of ValTy.
>>   
> Hello again,
> I removed the sizesEqual check. Now, I just search for the block B in the map, if there is no entry in the BlockDataMap it means the user didn't provide any initialization. In this way we didn't assume any specific detail of ValTy.
>> A performance nit: here 'getBlockDataMap()' is called twice.  That's wasteful as it requires two lookups in the hashtable instead of one.
>>   
> I also optimized the code by reducing the number of lookups to 1.
>> Also, please use spaces instead of tabs.
> ok.
> 
> cheers, Simone
> <DataFlowSolver.patch>

Hi Simone,

This is definitely a better solution, but this actually still does two hash table lookups.  The 'insert' causes a second lookup in the DenseMap.  That said, I think this is reasonable to go into the tree as is.  The patch still contains tabs, but I've gone and corrected that.  I've committed this patch in r109243.

If we want to remove the second lookup, consider that DataFlowSolver is based on generic programming.  This means that we just need to expect that there is something generic that we expect in all ValTys:

ValTy &blockV = D.getBlockDataMap()[B];
bool isInitialized = blockV.isInitialized();
// If no edges have been found, it means this is the first time the solver 
// has been called on block B, we copy the initialization values (if any)
// as current value for V (which will be used as edge data)
if(noEdges && isInitialized) 
  Merge(V, blockV);
...

and then add 'isInitialized' to the specific ValTy class.

Looking at this solver again, I think this would be better done with a trait class.  For example, instead of adding 'isInitialized()' we could have:

  bool isInitialized = dataflow::ValTrait<ValTy>::isInitialized(blockV);

Unfortunately, it looks like DataflowSolver (which I wrote a while ago)  expects that ValTy has a 'copyValues' method (and others), so this same design issue exists in multiple places.  This requires us to always define new classes for ValTy just for use in the DataflowSolver, and prohibits us from using basic types such as "int" or even pointers.  A better approach would be to use a trait, e.g.: dataflow::ValTrait<ValTy>::copyValues(...,...).  This way we could define these operations without requiring special classes for ValTy.  This would be a nice cleanup.

Thanks for the patch!

Cheers,
Ted