[cfe-dev] AST Writer

mats petersson via cfe-dev cfe-dev at lists.llvm.org
Wed Jul 18 03:20:35 PDT 2018


Try having a look at an OpenCL implementation - pocl is the one that comes
to mind. OpenCL relies on taking a string and outputting code, all in
memory [the spec doesn't precisely say you can't generate a file and
compile that through a standalone executable, but that's not exactly a
"nice" solution].

I work on ARM's OpenCL solution, so I'm not familiar with the details of
pocl, but I'm 100% sure that they do something similar to what we do -
build/take a string, call into various parts of clang, and produce a
binary executable in memory.

It may not be 100% like what you want to do, but it should give you
something to start from.

--
Mats

On 17 July 2018 at 19:08, Matthieu Brucher via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi,
>
> I can easily add source code to a file. It's a no-brainer; I'm not going
> to use clang for this, it's overkill.
> What doesn't work, as stated in my previous example, is getting a module
> out of clang, a module that can be used inside llvm. When executing the
> code below, I get a write error. That's a problem because there are no
> resources online on this issue. The API changes too quickly for this, and
> even libclang doesn't help because then the triple is not set (and then
> llvm breaks).
>
> Cheers
> Matthieu
>
> On Tue, 17 Jul 2018 at 18:25, Firat Kasmis <firatkasmis at gmail.com>
> wrote:
>
>> Matthieu, try https://github.com/firolino/clang-tool as a starting point
>> and change the transformer to your needs to insert code/text at a given
>> location. Hope it helps.
>>
>> Best,
>> Firat
>>
>> On Tue, 17 Jul 2018 at 09:00, Matthieu Brucher via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> Indeed, that's what I'm now aiming at. Unfortunately, it seems that
>>> there are no examples of how to use FrontendAction properly with clang
>>> 6.0.0. I can use libtooling with runToolOnCode to generate a module, but the
>>> triple is not set up properly in that case when I want to use the JIT. And
>>> it seems to be a problem with clang, because if I do this:
>>>
>>>     clang::DiagnosticOptions diagnosticOptions;
>>>     std::unique_ptr<clang::TextDiagnosticPrinter> textDiagnosticPrinter =
>>>       std::make_unique<clang::TextDiagnosticPrinter>(llvm::outs(),
>>>                                                      &diagnosticOptions);
>>>     llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs> diagIDs;
>>>
>>>     std::unique_ptr<clang::DiagnosticsEngine> diagnosticsEngine =
>>>       std::make_unique<clang::DiagnosticsEngine>(diagIDs, &diagnosticOptions, textDiagnosticPrinter.get());
>>>
>>>     clang::LangOptions languageOptions;
>>>     clang::FileSystemOptions fileSystemOptions;
>>>     clang::FileManager fileManager(fileSystemOptions);
>>>     clang::SourceManager sourceManager(*diagnosticsEngine,
>>>                                        fileManager);
>>>     std::shared_ptr<clang::HeaderSearchOptions> headerSearchOptions(new clang::HeaderSearchOptions());
>>>
>>>     const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
>>>     targetOptions->Triple = llvm::sys::getDefaultTargetTriple();
>>>
>>>     std::unique_ptr<clang::TargetInfo> targetInfo(
>>>       clang::TargetInfo::CreateTargetInfo(*diagnosticsEngine, targetOptions));
>>>
>>>     clang::HeaderSearch headerSearch(headerSearchOptions,
>>>                                      sourceManager,
>>>                                      *diagnosticsEngine,
>>>                                      languageOptions,
>>>                                      targetInfo.get());
>>>     clang::MemoryBufferCache PCMCache;
>>>     clang::CompilerInstance compInst;
>>>
>>>     std::shared_ptr<clang::PreprocessorOptions> opts(std::make_shared<clang::PreprocessorOptions>());
>>>     clang::Preprocessor preprocessor(opts,
>>>                                      *diagnosticsEngine,
>>>                                      languageOptions,
>>>                                      sourceManager,
>>>                                      PCMCache,
>>>                                      headerSearch,
>>>                                      compInst);
>>>     preprocessor.Initialize(*targetInfo);
>>>
>>>     auto filter = llvm::MemoryBuffer::getMemBufferCopy(fullfile);
>>>
>>>     sourceManager.setMainFileID(sourceManager.createFileID(std::move(filter)));
>>>
>>>     clang::IdentifierTable identifierTable(languageOptions);
>>>     clang::SelectorTable selectorTable;
>>>
>>>     clang::Builtin::Context builtinContext;
>>>     builtinContext.InitializeTarget(*targetInfo, nullptr);
>>>     clang::ASTContext astContext(languageOptions,
>>>                                  sourceManager,
>>>                                  identifierTable,
>>>                                  selectorTable,
>>>                                  builtinContext);
>>>     astContext.InitBuiltinTypes(*targetInfo);
>>>     compInst.setTarget(targetInfo.get());
>>>
>>>     llvm::LLVMContext context;
>>>     std::unique_ptr<clang::CodeGenAction> action = std::make_unique<clang::EmitLLVMAction>(&context);
>>>
>>>     textDiagnosticPrinter->BeginSourceFile(languageOptions, &preprocessor);
>>>
>>>     compInst.ExecuteAction(*action);
>>>
>>>
>>> Then inside the action, even if I created the TargetInfo myself, clang
>>> tries something nasty:
>>>
>>> ASAN:DEADLYSIGNAL
>>> =================================================================
>>> ==25220==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000120 (pc 0x00010e786a2b bp 0x7ffee23ffda0 sp 0x7ffee23ffcc0 T0)
>>> ==25220==The signal is caused by a WRITE memory access.
>>> ==25220==Hint: address points to the zero page.
>>>     #0 0x10e786a2a in clang::TargetInfo::CreateTargetInfo(clang::DiagnosticsEngine&, std::__1::shared_ptr<clang::TargetOptions> const&) (libATKModelling.dylib:x86_64+0xf19a2a)
>>>     #1 0x10ea7559b in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (libATKModelling.dylib:x86_64+0x120859b)
>>>
>>> Cheers,
>>>
>>> Matthieu
>>>
>>> On Mon, 16 Jul 2018 at 18:00, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> I guess a few layers:
>>>>
>>>> If you're going source-to-source and want users to see/modify the new
>>>> source, then making text edits based on source locations found in the AST
>>>> (but not modifying the AST itself) is generally the suggested idea. If you
>>>> simultaneously want to produce that source and compile it - yeah, probably
>>>> easier to write it out, then compile it from that source on the filesystem.
>>>>
>>>> (there are probably some ways to compile from source in memory - but
>>>> I'm not sure of the details, it might involve using the virtual filesystem
>>>> layers - I think they were implemented for continuous compilation in IDEs
>>>> (compiling from the edited source buffers open in the editor without having
>>>> to write them to disk first))
>>>>
>>>> On Fri, Jul 13, 2018 at 3:01 PM Matthieu Brucher <
>>>> matthieu.brucher at gmail.com> wrote:
>>>>
>>>>> My domain would be electrical schematic modeling. Some people would like
>>>>> to have the generated code, but then change one model of a component to
>>>>> something else. Or swap the Newton-Raphson algorithm for another one. Or
>>>>> remove entries in the Jacobian matrix to check for terms that don't bring
>>>>> much to the result but could enhance performance.
>>>>> I could write the code in memory and then pass it to clang, but it
>>>>> feels... odd. But maybe that's what I need to do in the end? Is there an
>>>>> example of getting code from a string?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Matthieu
>>>>>
>>>>> On Tue, 10 Jul 2018 at 23:17, David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 10, 2018 at 2:49 PM Matthieu Brucher <
>>>>>> matthieu.brucher at gmail.com> wrote:
>>>>>>
>>>>>>> That's my use case; it's probably different from the OP's.
>>>>>>>
>>>>>>> In my case, I want to generate a first pass, with a JIT (the code is
>>>>>>> generated from another description), but the generated code could be
>>>>>>> changed by the user in a subsequent pass.
>>>>>>>
>>>>>>
>>>>>> Curious. As much as possible, I'd encourage you to find ways to not
>>>>>> have users work with generated code (by abstracting that generated code
>>>>>> away from them - giving them a higher level representation to write, places
>>>>>> where the generated code calls back into the user code, etc). But I don't
>>>>>> know your domain, etc, and wouldn't suggest what is or isn't right for you
>>>>>> and your users.
>>>>>>
>>>>>> But the main takeaway is that modifying the AST and generating code
>>>>>> from that is discouraged in favor of generating source code edits.
>>>>>>
>>>>>>
>>>>>>> Modifying the AST directly is not an option: try generating
>>>>>>> equations with thousands of parameters that are solved in real time. Just
>>>>>>> no way someone can write them efficiently in IR (that's why you have the
>>>>>>> AST to IR generator!).
>>>>>>>
>>>>>>> I don't understand your last paragraph. If clang-format can clean up
>>>>>>> rewrites, why can't it reformat code from the AST? If the AST printer
>>>>>>> writes any kind of code, why couldn't clang-format reformat it?
>>>>>>>
>>>>>>
>>>>>> clang-format could format AST generated source too - I was commenting
>>>>>> on that in answer to your question "Easier to generate correctly formatted
>>>>>> code from the AST?" - that it's not easier to generate correctly formatted
>>>>>> code from the AST than it is from a textual edit. In both cases you'd use
>>>>>> something like clang-format to tidy up the result. The AST itself doesn't
>>>>>> have fancy formatting support so it's no better than a textual edit in
>>>>>> terms of getting nicely formatted results.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 10 Jul 2018 at 22:41, David Blaikie <dblaikie at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hmm, not sure I follow.
>>>>>>>>
>>>>>>>> Did the user write this source code? Are they going to want to
>>>>>>>> change it later? Does it make sense for them to see the edits you're
>>>>>>>> suggesting, or are those edits really compiler
>>>>>>>> optimizations/transformations? If they're more the latter, then perhaps
>>>>>>>> caching the LLVM IR (with these optimizations/transformations applied)
>>>>>>>> rather than modifying the source would be more suitable.
>>>>>>>>
>>>>>>>> Easier to generate correctly formatted code from the AST? Not
>>>>>>>> really - the AST printing doesn't have any particularly nuanced formatted
>>>>>>>> printing. That's what clang-format is for (it was specifically built for
>>>>>>>> doing code rewrites based on ASTs - where the rewrite is expressed as a
>>>>>>>> textual change to the original source (not an AST modification) & that
>>>>>>>> change is applied, then clang-format is used to tidy it up).
>>>>>>>>
>>>>>>>> On Tue, Jul 10, 2018 at 2:11 PM Matthieu Brucher <
>>>>>>>> matthieu.brucher at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> It's odd though, because generating code on the fly would be
>>>>>>>>> easier on the AST than on the IR tree, if the goal is JIT and also saving
>>>>>>>>> the code at the same time.
>>>>>>>>> It's probably also easier to generate properly formatted code?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Matthieu
>>>>>>>>>
>>>>>>>>> On Tue, 10 Jul 2018 at 16:21, David Blaikie via cfe-dev <
>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> It's generally considered that the AST invariants are too
>>>>>>>>>> subtle/complex to use AST modification and AST->source conversion reliably.
>>>>>>>>>> Refactoring/source code modification is generally encouraged to be done via
>>>>>>>>>> textual edits generated from source location information in the AST.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 9, 2018 at 8:36 PM Ridwan Shariffdeen via cfe-dev <
>>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I am trying to build a tool which can insert new AST nodes into an
>>>>>>>>>>> AST obtained from source code and generate the modified source code.
>>>>>>>>>>> For example, add an if condition at a given location.
>>>>>>>>>>>
>>>>>>>>>>> I have seen examples of Rewriter, which can insert text, but I
>>>>>>>>>>> want to insert a proper AST node and generate the source code from the
>>>>>>>>>>> modified AST.
>>>>>>>>>>>
>>>>>>>>>>> For this purpose, I think I should be using ASTWriter and not
>>>>>>>>>>> Rewriter. Is there any documentation I can refer to on how to implement this?
>>>>>>>>>>>
>>>>>>>>>>> Any help in this regard is highly appreciated.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Ridwan
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Quantitative analyst, Ph.D.
>>>>>>>>> Blog: http://blog.audio-tk.com/
>>>>>>>>> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
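
The approach recommended in the quoted discussion, expressing a rewrite as
textual edits keyed on source offsets rather than as AST modifications, can
be sketched in plain C++ roughly as follows. This is only a minimal
illustration under stated assumptions: TextEdit and applyEdits are
hypothetical names, not clang APIs, and in a real tool the offsets would come
from source locations found in the AST (clang ships its own machinery for
this, such as clang::Rewriter and clang::tooling::Replacement).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type, not part of clang: one textual edit that
// replaces `length` characters starting at `offset` with `text`.
// A length of 0 is a pure insertion.
struct TextEdit {
    std::size_t offset;
    std::size_t length;
    std::string text;
};

// Apply non-overlapping edits to a copy of `source` and return the result.
// Edits are applied from the highest offset down, so the offsets of the
// remaining edits stay valid while the buffer grows or shrinks.
// std::string::replace throws std::out_of_range if an offset lies past
// the end of the buffer.
std::string applyEdits(std::string source, std::vector<TextEdit> edits) {
    std::sort(edits.begin(), edits.end(),
              [](const TextEdit &a, const TextEdit &b) {
                  return a.offset > b.offset;
              });
    for (const TextEdit &edit : edits) {
        source.replace(edit.offset, edit.length, edit.text);
    }
    return source;
}
```

For example, inserting "if (x > 0) " at the offset of "g(x);" in
"void f(int x) { g(x); }" gives "void f(int x) { if (x > 0) g(x); }", which
could then be tidied with clang-format. The original AST is only read, never
modified, which is the point of the recommendation.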