[cfe-dev] AST Writer

Matthieu Brucher via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 17 00:00:21 PDT 2018


Indeed, that's what I'm now aiming at. Unfortunately, there seem to be no
examples of how to use FrontendAction properly with clang 6.0.0. I can use
libTooling with runToolOnCode to generate a module, but the triple is not
set up properly in that case when I want to use the JIT. And it seems to be
a problem with clang, because if I do this:

    clang::DiagnosticOptions diagnosticOptions;
    std::unique_ptr<clang::TextDiagnosticPrinter> textDiagnosticPrinter =
      std::make_unique<clang::TextDiagnosticPrinter>(llvm::outs(),
                                                     &diagnosticOptions);
    llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs> diagIDs;

    std::unique_ptr<clang::DiagnosticsEngine> diagnosticsEngine =
      std::make_unique<clang::DiagnosticsEngine>(diagIDs,
                                                 &diagnosticOptions,
                                                 textDiagnosticPrinter.get());

    clang::LangOptions languageOptions;
    clang::FileSystemOptions fileSystemOptions;
    clang::FileManager fileManager(fileSystemOptions);
    clang::SourceManager sourceManager(*diagnosticsEngine,
                                       fileManager);
    std::shared_ptr<clang::HeaderSearchOptions> headerSearchOptions(
      new clang::HeaderSearchOptions());

    const std::shared_ptr<clang::TargetOptions> targetOptions =
      std::make_shared<clang::TargetOptions>();
    targetOptions->Triple = llvm::sys::getDefaultTargetTriple();

    std::unique_ptr<clang::TargetInfo> targetInfo(
      clang::TargetInfo::CreateTargetInfo(*diagnosticsEngine, targetOptions));

    clang::HeaderSearch headerSearch(headerSearchOptions,
                                     sourceManager,
                                     *diagnosticsEngine,
                                     languageOptions,
                                     targetInfo.get());
    clang::MemoryBufferCache PCMCache;
    clang::CompilerInstance compInst;

    std::shared_ptr<clang::PreprocessorOptions> opts(
      std::make_shared<clang::PreprocessorOptions>());
    clang::Preprocessor preprocessor(opts,
                                     *diagnosticsEngine,
                                     languageOptions,
                                     sourceManager,
                                     PCMCache,
                                     headerSearch,
                                     compInst);
    preprocessor.Initialize(*targetInfo);

    auto filter = llvm::MemoryBuffer::getMemBufferCopy(fullfile);

    sourceManager.setMainFileID(sourceManager.createFileID(std::move(filter)));

    clang::IdentifierTable identifierTable(languageOptions);
    clang::SelectorTable selectorTable;

    clang::Builtin::Context builtinContext;
    builtinContext.InitializeTarget(*targetInfo, nullptr);
    clang::ASTContext astContext(languageOptions,
                                 sourceManager,
                                 identifierTable,
                                 selectorTable,
                                 builtinContext);
    astContext.InitBuiltinTypes(*targetInfo);
    compInst.setTarget(targetInfo.get());

    llvm::LLVMContext context;
    std::unique_ptr<clang::CodeGenAction> action =
      std::make_unique<clang::EmitLLVMAction>(&context);

    textDiagnosticPrinter->BeginSourceFile(languageOptions, &preprocessor);

    compInst.ExecuteAction(*action);


Then, inside the action, even though I created the TargetInfo myself, clang
calls CreateTargetInfo again and crashes:

ASAN:DEADLYSIGNAL
=================================================================
==25220==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000120 (pc 0x00010e786a2b bp 0x7ffee23ffda0 sp 0x7ffee23ffcc0 T0)
==25220==The signal is caused by a WRITE memory access.
==25220==Hint: address points to the zero page.
    #0 0x10e786a2a in clang::TargetInfo::CreateTargetInfo(clang::DiagnosticsEngine&, std::__1::shared_ptr<clang::TargetOptions> const&) (libATKModelling.dylib:x86_64+0xf19a2a)
    #1 0x10ea7559b in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (libATKModelling.dylib:x86_64+0x120859b)

Cheers,

Matthieu

On Mon, Jul 16, 2018 at 6:00 PM David Blaikie <dblaikie at gmail.com> wrote:

> I guess a few layers:
>
> If you're going source-to-source and want users to see/modify the new
> source, then making text edits based on source locations found in the AST
> (but not modifying the AST itself) is generally the suggested idea. If you
> simultaneously want to produce that source and compile it - yeah, probably
> easier to write it out, then compile it from that source on the filesystem.
>
> (there are probably some ways to compile from source in memory - but I'm
> not sure of the details, it might involve using the virtual filesystem
> layers - I think they were implemented for continuous compilation in IDEs
> (compiling from the edited source buffers open in the editor without having
> to write them to disk first))
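
The text-edit approach described above can be illustrated without any clang
dependency. In a real tool the offsets would come from source locations found
in the AST (for instance through clang::Rewriter or the tooling Replacement
class); here they are hard-coded, and the TextEdit struct and applyEdits
function are hypothetical names invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a source edit: replace `length` bytes starting at
// byte `offset` of the original buffer with `text`. A zero length means a
// pure insertion.
struct TextEdit {
  std::size_t offset;
  std::size_t length;
  std::string text;
};

// Applies non-overlapping edits to a copy of `source`. Edits are applied from
// the highest offset to the lowest, so the offsets of the remaining edits
// still refer to unmodified text.
std::string applyEdits(std::string source, std::vector<TextEdit> edits) {
  std::sort(edits.begin(), edits.end(),
            [](const TextEdit &a, const TextEdit &b) {
              return a.offset > b.offset;
            });
  for (const TextEdit &edit : edits) {
    // Each edit must lie inside the buffer.
    assert(edit.offset + edit.length <= source.size());
    source.replace(edit.offset, edit.length, edit.text);
  }
  return source;
}
```

For example, applyEdits("x = a / b;", {{0, 0, "if (b != 0) "}}) yields
"if (b != 0) x = a / b;", which could then be tidied up with clang-format.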
>
> On Fri, Jul 13, 2018 at 3:01 PM Matthieu Brucher <
> matthieu.brucher at gmail.com> wrote:
>
>> My domain would be electrical schematic modeling. Some people would like to
>> have the generated code, but then change one model of a component to
>> something else. Or swap the Newton-Raphson algorithm for another one. Or
>> remove entries in the Jacobian matrix to check for terms that don't bring
>> much to the result but could enhance performance.
>> I could write the code in memory and then pass it to clang, but it
>> feels... odd. But maybe that's what I need to do in the end? Is there an
>> example of getting code from a string?
>>
>> Cheers,
>>
>> Matthieu
>>
>> On Tue, Jul 10, 2018 at 11:17 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Jul 10, 2018 at 2:49 PM Matthieu Brucher <
>>> matthieu.brucher at gmail.com> wrote:
>>>
>>>> That's my use case; it's probably different from the OP's.
>>>>
>>>> In my case, I want to generate a first pass, with a JIT (the code is
>>>> generated from another description), but the generated code could be
>>>> changed by the user in a subsequent pass.
>>>>
>>>
>>> Curious. As much as possible, I'd encourage you to find ways to not have
>>> users work with generated code (by abstracting that generated code away
>>> from them - giving them a higher level representation to write, places
>>> where the generated code calls back into the user code, etc). But I don't
>>> know your domain, etc, and wouldn't suggest what is or isn't right for you
>>> and your users.
>>>
>>> But the main takeaway is that modifying the AST and generating code from
>>> that is discouraged in favor of generating source code edits.
>>>
>>>
>>>> Directly modifying the AST is not an option: try generating equations
>>>> with thousands of parameters that are solved in real time. There is just
>>>> no way someone can write them efficiently in IR (that's why you have the
>>>> AST to IR generator!).
>>>>
>>>> I don't understand your last paragraph. If clang-format can clean up
>>>> rewrites, why can't it reformat code from the AST? If the AST printer
>>>> writes any kind of code, why couldn't clang-format reformat it?
>>>>
>>>
>>> clang-format could format AST generated source too - I was commenting on
>>> that in answer to your question "Easier to generate correctly formatted
>>> code from the AST?" - that it's not easier to generate correctly formatted
>>> code from the AST than it is from a textual edit. In both cases you'd use
>>> something like clang-format to tidy up the result. The AST itself doesn't
>>> have fancy formatting support so it's no better than a textual edit in
>>> terms of getting nicely formatted results.
>>>
>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jul 10, 2018 at 10:41 PM David Blaikie <dblaikie at gmail.com> wrote:
>>>>
>>>>> Hmm, not sure I follow.
>>>>>
>>>>> Did the user write this source code? Are they going to want to change
>>>>> it later? Does it make sense for them to see the edits you're suggesting,
>>>>> or are those edits really compiler optimizations/transformations? If
>>>>> they're more the latter, then perhaps caching the LLVM IR (with these
>>>>> optimizations/transformations applied) rather than modifying the source
>>>>> would be more suitable.
>>>>>
>>>>> Easier to generate correctly formatted code from the AST? Not really -
>>>>> the AST printing doesn't have any particularly nuanced formatted printing.
>>>>> That's what clang-format is for (it was specifically built for doing code
>>>>> rewrites based on ASTs - where the rewrite is expressed as a textual change
>>>>> to the original source (not an AST modification) & that change is applied,
>>>>> then clang-format is used to tidy it up).
>>>>>
>>>>> On Tue, Jul 10, 2018 at 2:11 PM Matthieu Brucher <
>>>>> matthieu.brucher at gmail.com> wrote:
>>>>>
>>>>>> It's odd though, because generating code on the fly would be easier
>>>>>> on the AST than on the IR tree, if the goal is JIT and also saving the code
>>>>>> at the same time.
>>>>>> It's probably also easier to generate properly formatted code?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Matthieu
>>>>>>
>>>>>> On Tue, Jul 10, 2018 at 4:21 PM David Blaikie via cfe-dev <
>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> It's generally considered that the AST invariants are too
>>>>>>> subtle/complex to use AST modification and AST->source conversion reliably.
>>>>>>> Refactoring/source code modification is generally encouraged to be done via
>>>>>>> textual edits generated from source location information in the AST.
>>>>>>>
>>>>>>> On Mon, Jul 9, 2018 at 8:36 PM Ridwan Shariffdeen via cfe-dev <
>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to build a tool which can insert new AST nodes into an AST
>>>>>>>> obtained from source code and generate the modified source code. For
>>>>>>>> example, add an if condition at a given location.
>>>>>>>>
>>>>>>>> I have seen examples of Rewriter, which can insert text, but I want
>>>>>>>> to insert a proper AST node and generate the source code from the modified
>>>>>>>> AST.
>>>>>>>>
>>>>>>>> For this purpose, I think I should be using ASTWriter and not
>>>>>>>> Rewriter. Is there any documentation I can refer to on how to implement this?
>>>>>>>>
>>>>>>>> Any help in this regard is highly appreciated.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Ridwan
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing list
>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Quantitative analyst, Ph.D.
>>>>>> Blog: http://blog.audio-tk.com/
>>>>>> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>

-- 
Quantitative analyst, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher