[cfe-dev] AST Writer

Tue Jul 17 10:24:59 PDT 2018

Matthieu, try https://github.com/firolino/clang-tool as a starting point and
change the transformer to suit your needs, so that it inserts code/text at a
given location. Hope it helps.

Best,
Firat

On Tue, Jul 17, 2018 at 9:00 AM Matthieu Brucher via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Indeed, that's what I'm now aiming at. Unfortunately, there seem to be no
> examples of how to use FrontendAction properly with clang 6.0.0. I can use
> libTooling with runToolOnCode to generate a module, but the triple is not
> set up properly in that case when I want to use the JIT. And it seems to be
> a problem with clang, because if I do this:
>
>     clang::DiagnosticOptions diagnosticOptions;
>     std::unique_ptr<clang::TextDiagnosticPrinter> textDiagnosticPrinter =
>       std::make_unique<clang::TextDiagnosticPrinter>(llvm::outs(),
>                                                      &diagnosticOptions);
>     llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs> diagIDs;
>
>     std::unique_ptr<clang::DiagnosticsEngine> diagnosticsEngine =
>       std::make_unique<clang::DiagnosticsEngine>(diagIDs, &diagnosticOptions, textDiagnosticPrinter.get());
>
>     clang::LangOptions languageOptions;
>     clang::FileSystemOptions fileSystemOptions;
>     clang::FileManager fileManager(fileSystemOptions);
>     clang::SourceManager sourceManager(*diagnosticsEngine,
>                                        fileManager);
>     std::shared_ptr<clang::HeaderSearchOptions> headerSearchOptions(new clang::HeaderSearchOptions());
>
>     const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
>     targetOptions->Triple = llvm::sys::getDefaultTargetTriple();
>
>     std::unique_ptr<clang::TargetInfo> targetInfo(
>       clang::TargetInfo::CreateTargetInfo(*diagnosticsEngine, targetOptions));
>
>     clang::HeaderSearch headerSearch(headerSearchOptions,
>                                      sourceManager,
>                                      *diagnosticsEngine,
>                                      languageOptions,
>                                      targetInfo.get());
>     clang::MemoryBufferCache PCMCache;
>     clang::CompilerInstance compInst;
>
>     std::shared_ptr<clang::PreprocessorOptions> opts(std::make_shared<clang::PreprocessorOptions>());
>     clang::Preprocessor preprocessor(opts,
>                                      *diagnosticsEngine,
>                                      languageOptions,
>                                      sourceManager,
>                                      PCMCache,
>                                      headerSearch,
>                                      compInst);
>     preprocessor.Initialize(*targetInfo);
>
>     auto filter = llvm::MemoryBuffer::getMemBufferCopy(fullfile);
>
>     sourceManager.setMainFileID(sourceManager.createFileID(std::move(filter)));
>
>     clang::IdentifierTable identifierTable(languageOptions);
>     clang::SelectorTable selectorTable;
>
>     clang::Builtin::Context builtinContext;
>     builtinContext.InitializeTarget(*targetInfo, nullptr);
>     clang::ASTContext astContext(languageOptions,
>                                  sourceManager,
>                                  identifierTable,
>                                  selectorTable,
>                                  builtinContext);
>     astContext.InitBuiltinTypes(*targetInfo);
>     compInst.setTarget(targetInfo.get());
>
>     llvm::LLVMContext context;
>     std::unique_ptr<clang::CodeGenAction> action = std::make_unique<clang::EmitLLVMAction>(&context);
>
>     textDiagnosticPrinter->BeginSourceFile(languageOptions, &preprocessor);
>
>     compInst.ExecuteAction(*action);
>
>
> Then inside the action, even though I created the TargetInfo myself, clang
> tries something nasty:
>
> ASAN:DEADLYSIGNAL
> =================================================================
> ==25220==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000120 (pc 0x00010e786a2b bp 0x7ffee23ffda0 sp 0x7ffee23ffcc0 T0)
> ==25220==The signal is caused by a WRITE memory access.
> ==25220==Hint: address points to the zero page.
>     #0 0x10e786a2a in clang::TargetInfo::CreateTargetInfo(clang::DiagnosticsEngine&, std::__1::shared_ptr<clang::TargetOptions> const&) (libATKModelling.dylib:x86_64+0xf19a2a)
>     #1 0x10ea7559b in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (libATKModelling.dylib:x86_64+0x120859b)
>
> Cheers,
>
> Matthieu
>
> On Mon, Jul 16, 2018 at 6:00 PM David Blaikie <dblaikie at gmail.com>
> wrote:
>
>> I guess a few layers:
>>
>> If you're going source-to-source and want users to see/modify the new
>> source, then making text edits based on source locations found in the AST
>> (but not modifying the AST itself) is generally the suggested idea. If you
>> simultaneously want to produce that source and compile it - yeah, probably
>> easier to write it out, then compile it from that source on the filesystem.
>>
>> (there are probably some ways to compile from source in memory - but I'm
>> not sure of the details, it might involve using the virtual filesystem
>> layers - I think they were implemented for continuous compilation in IDEs
>> (compiling from the edited source buffers open in the editor without having
>> to write them to disk first))
>>
>> On Fri, Jul 13, 2018 at 3:01 PM Matthieu Brucher <
>> matthieu.brucher at gmail.com> wrote:
>>
>>> My domain would be electrical schema modeling. Some people would like to
>>> have the generated code, but then change one model of a component to
>>> something else. Or remove the Newton Raphson algorithm for another one. Or
>>> remove entries in the Jacobian matrix to check for terms that don't bring
>>> much to the result but could enhance performance.
>>> I could write the code in memory and then pass it to clang, but it
>>> feels... odd. But maybe that's what I need to do in the end? Is there an
>>> example of getting code from a string?
>>>
>>> Cheers,
>>>
>>> Matthieu
>>>
>>> On Tue, Jul 10, 2018 at 11:17 PM David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jul 10, 2018 at 2:49 PM Matthieu Brucher <
>>>> matthieu.brucher at gmail.com> wrote:
>>>>
>>>>> That's my use case, it's different than the OP, probably.
>>>>>
>>>>> In my case, I want to generate a first pass, with a JIT (the code is
>>>>> generated from another description), but the generated code could be
>>>>> changed by the user in a subsequent pass.
>>>>>
>>>>
>>>> Curious. As much as possible, I'd encourage you to find ways to not
>>>> have users work with generated code (by abstracting that generated code
>>>> away from them - giving them a higher level representation to write, places
>>>> where the generated code calls back into the user code, etc). But I don't
>>>> know your domain, etc, and wouldn't suggest what is or isn't right for you
>>>> and your users.
>>>>
>>>> But the main takeaway is that modifying the AST and generating code
>>>> from that is discouraged in favor of generating source code edits.
>>>>
>>>>
>>>>> Modifying the AST directly is not an option, try generating equations
>>>>> with thousands of parameters that are solved in real time. Just no way
>>>>> someone can write them efficiently in IR (that's why you have the AST to IR
>>>>> generator!).
>>>>>
>>>>> I don't understand your last paragraph. If clang-format can cleanup
>>>>> rewrites, why can't it reformat code from the AST? If the AST printer
>>>>> writes any kind of code, why couldn't clang-format reformat it?
>>>>>
>>>>
>>>> clang-format could format AST generated source too - I was commenting
>>>> on that in answer to your question "Easier to generate correctly formatted
>>>> code from the AST?" - that it's not easier to generate correctly formatted
>>>> code from the AST than it is from a textual edit. In both cases you'd use
>>>> something like clang-format to tidy up the result. The AST itself doesn't
>>>> have fancy formatting support so it's no better than a textual edit in
>>>> terms of getting nicely formatted results.
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 10, 2018 at 10:41 PM David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hmm, not sure I follow.
>>>>>>
>>>>>> Did the user write this source code? Are they going to want to change
>>>>>> it later? Does it make sense for them to see the edits you're suggesting,
>>>>>> or are those edits really compiler optimizations/transformations? If
>>>>>> they're more the latter, then perhaps caching the LLVM IR (with these
>>>>>> optimizations/transformations applied) rather than modifying the source
>>>>>> would be more suitable.
>>>>>>
>>>>>> Easier to generate correctly formatted code from the AST? Not really
>>>>>> - the AST printing doesn't have any particularly nuanced formatted
>>>>>> printing. That's what clang-format is for (it was specifically built for
>>>>>> doing code rewrites based on ASTs - where the rewrite is expressed as a
>>>>>> textual change to the original source (not an AST modification) & that
>>>>>> change is applied, then clang-format is used to tidy it up).
>>>>>>
>>>>>> On Tue, Jul 10, 2018 at 2:11 PM Matthieu Brucher <
>>>>>> matthieu.brucher at gmail.com> wrote:
>>>>>>
>>>>>>> It's odd though, because generating code on the fly would be easier
>>>>>>> on the AST than on the IR tree, if the goal is JIT and also saving the code
>>>>>>> at the same time.
>>>>>>> It's probably also easier to generate properly formatted code?
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Matthieu
>>>>>>>
>>>>>>> On Tue, Jul 10, 2018 at 4:21 PM David Blaikie via cfe-dev <
>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> It's generally considered that the AST invariants are too
>>>>>>>> subtle/complex to use AST modification and AST->source conversion reliably.
>>>>>>>> Refactoring/source code modification is generally encouraged to be done via
>>>>>>>> textual edits generated from source location information in the AST.
>>>>>>>>
>>>>>>>> On Mon, Jul 9, 2018 at 8:36 PM Ridwan Shariffdeen via cfe-dev <
>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am trying to build a tool which can insert new AST nodes into an
>>>>>>>>> AST obtained from source code and generate the modified source code.
>>>>>>>>> For example, add an if condition at a given location.
>>>>>>>>>
>>>>>>>>> I have seen examples of Rewriter, which can insert text, but I want
>>>>>>>>> to insert a proper AST node and generate the source code from the modified
>>>>>>>>> AST.
>>>>>>>>>
>>>>>>>>> For this purpose, I think I should be using ASTWriter and not
>>>>>>>>> Rewriter. Is there any documentation I can refer to on how to implement this?
>>>>>>>>>
>>>>>>>>> Any help in this regard is highly appreciated.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Ridwan
>>>>>>>>> _______________________________________________
>>>>>>>>> cfe-dev mailing list
>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Quantitative analyst, Ph.D.
>>>>>>> Blog: http://blog.audio-tk.com/
>>>>>>> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>