[cfe-dev] source code rewriting for invalid ast nodes

Manuel Klimek klimek at google.com
Thu Oct 9 05:27:27 PDT 2014


I assume you cannot change the code generator?

Why can't you:
1. generate the code; parse it with the current version (which still has
'int x')
2. find all the 'int x' declarations you want to change to 'X x'; also find
all uses of them (including uses in array[x]); output all this information
in some format
3. run over all those cases; now you can change 'int x' to 'X x' and
'array[x]' to 'myfunc(array, x)' at the same time
4. reap benefits; the codebase is never in a non-parsing state
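
To make the end state of step 3 concrete, here is a minimal sketch in plain
C++ of what the code could look like once both edits have been applied
together. Everything beyond the names 'X' and 'myfunc' is an assumption made
for illustration: the accessor raw(), the bounds check, and the helper
read_third() are hypothetical and do not come from the generated code.

```cpp
#include <cassert>

// Hypothetical stand-in for class X: wraps an int and deliberately
// provides no operator int(), so the built-in array[x] does not compile.
class X {
public:
    explicit X(int v) : value_(v) {}
    int raw() const { return value_; }  // assumed accessor, for illustration only
private:
    int value_;
};

// Hypothetical replacement for the built-in subscript. Simulation-specific
// work would go here; this sketch only checks the index and forwards it.
template <typename T>
T& myfunc(T* array, const X& x) {
    assert(x.raw() >= 0);
    return array[x.raw()];
}

// State after step 3: declaration and use were rewritten in the same pass,
// so the file parses both before and after the transformation.
inline int read_third(int* array) {
    X x(2);                   // was: int x = 2;
    return myfunc(array, x);  // was: array[x]
}
```

Because the declaration and every use are rewritten from information
collected on the still-valid 'int x' version, no intermediate file ever
contains 'array[x]' with x of type X.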

Cheers,
/Manuel

On Thu Oct 09 2014 at 2:10:51 PM Marc Greim <marc.greim at mytum.de> wrote:

> In this particular example "array[x]" was generated externally while I
> changed the declaration "int x;" to "X x;". That is the point where I need
> to patch the generated code in order to fix the missing operator[] error
> for type X and allow other simulation-relevant operations.
>
> On Thu, Oct 9, 2014 at 2:04 PM, Manuel Klimek <klimek at google.com> wrote:
>
>> On Thu Oct 09 2014 at 1:59:13 PM Marc Greim <marc.greim at mytum.de> wrote:
>>
>>> In general I would also say that doing code transformation should only
>>> be done on valid code since one needs to know what actually happens. This
>>> particular problem is unfortunately rather specific. Sorry if the given
>>> example was not sufficient.
>>>
>>> Maybe the problem becomes clearer when class X is defined as wrapper for
>>> int with special functionality.
>>>
>>> The code that I am trying to transform is part of a hardware simulation
>>> written/generated in C++. The code is guaranteed to be valid except for the
>>> partial substitution of int variables by X variables. The resulting invalid
>>> code statements are only invalid in the sense that a parameter type is not
>>> right. Due to the complexity of the code it is not feasible/possible to
>>> predict where type X and where int is used for such operations. Since some
>>> operators (e.g. [] for pointers) cannot be overloaded with standard C++
>>> code, errors will come up. For that and other reasons it is necessary to
>>> rewrite the code to use custom functions instead of the operator itself.
>>> This is where my problem with clang lies. Those nodes are removed from the
>>> AST due to the missing operator[]/missing type conversion. But those are the
>>> nodes I need to preserve in order to run a matcher and transform the code
>>> as needed for the simulation. Again, as noted before, adding operator int()
>>> to class X is not a solution since that would create many ambiguity
>>> problems.
>>>
>>>
>>> So to boil the problem down further: How can I force clang to ignore
>>> wrong types that are passed to operators/functions and build the AST with
>>> such nodes?
>>>
>>> Again, I suspect that adding built-in operators for those cases is the
>>> way to go, but I don't know how to iterate over all types and then create
>>> empty dummy functions for that.
>>>
>>> I hope this describes my problem sufficiently.
>>>
>>
>> I still don't fully understand what the current situation is.
>> So you have code that calls array[x] with a class type x? How did you
>> produce that code?
>>
>>
>>>
>>> On Thu, Oct 9, 2014 at 12:53 PM, Manuel Klimek <klimek at google.com>
>>> wrote:
>>>
>>>> The general advice is to only do code transformations on valid code. I
>>>> don't know enough about your problem to understand why that is not possible.
>>>>
>>>> On Tue Oct 07 2014 at 4:37:52 PM Marc Greim <marc.greim at mytum.de>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm using clang to rewrite (generated) source code. Usually that can
>>>>> be done by running matchers on the ast in combination with a rewriter
>>>>> instance.
>>>>>
>>>>> Unfortunately this fails in the case of operator[] where the index
>>>>> argument cannot be converted to a valid type (e.g. class X{} x; int *
>>>>> array; int i = array[x]; ). The corresponding ast nodes are missing since
>>>>> the "array[x]" statement has no valid representation.
>>>>>
>>>>> How can I detect and rewrite the code in cases like the above example?
>>>>> "array[x]" -> "someFunc(array,x)"
>>>>>
>>>>> ExternalSemaSource::CorrectTypo doesn't get called in this case (maybe
>>>>> because all tokens are valid?) so that attempt failed.
>>>>>
>>>>> Using ExternalSemaSource::LookupUnqualified has also failed so far,
>>>>> because I haven't found a method to get a valid SourceRange for that code
>>>>> part. It would also require manual parsing of that code part, which seems
>>>>> like a "dirty" solution to me.
>>>>>
>>>>> The only idea that I have left is to declare builtin operators for any
>>>>> type with Sema::AddBuiltinCandidate, but that may result in many operator
>>>>> definitions. Also I have no idea how to iterate over all types and how to
>>>>> declare these functions. However this may be the best solution, because
>>>>> then matchers can be used to find and rewrite those code parts.
>>>>>
>>>>> I would appreciate any help to find a solution.
>>>>>
>>>>> Regards,
>>>>> Marc
>>>>>
>>>>>
>>>>> P.S. I am aware that adding operator int() to class X of the above example
>>>>> would allow those statements, but that is not an option, since X cannot be
>>>>> represented as int in my case and additional operations need to be
>>>>> performed; such an operator may also mess up other parts of the code.
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>
>>>>
>>>
>