<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/71554>71554</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Flang] Feature request/Discussion: Preprocessor/Prescanner callbacks
</td>
</tr>
<tr>
<th>Labels</th>
<td>
flang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
tarcisiofischer
</td>
</tr>
</table>
<pre>
Hello!
I am working on a project called [Codevis](https://invent.kde.org/sdk/codevis), which reads C and C++ source code and extracts a database representation from it. Among other things, that database is used to visualize the code as a graph-like structure in a Qt-based application. I am currently running some experiments to read Fortran code in Codevis, and I am using flang to do that.
I am already able to access the Fortran AST, walk it, and extract what we call the "logical representation" of the code. For a first try, it has been quite easy, and I must say I really like how things are done in flang :)
The other representation in Codevis, the "physical representation", is not about the data structures but about the files and their inclusions. Basically, for each file, I need to find every `INCLUDE` and persist the fact that file `a.f` depends on `b.f` because of that include.
I have been able to hack the code to make it work, by adding a callback inside the Fortran preprocessor. More specifically, around here:
https://github.com/llvm/llvm-project/blob/75d6795e420274346b14aca8b6bd49bfe6030eeb/flang/lib/Parser/prescan.cpp#L896
My current hack is something like this:
```
const SourceFile *included{
    allSources_.Open(path, error, std::move(prependPath))};
if (!included) {
  Say(provenance, "INCLUDE: %s"_err_en_US, error.str());
} else if (included->bytes() > 0) {
  ProvenanceRange includeLineRange{
      provenance, static_cast<std::size_t>(p - nextLine_)};
  ProvenanceRange fileRange{
      allSources_.AddIncludedFile(*included, includeLineRange)};
  Prescanner{*this}.set_encoding(included->encoding()).Prescan(fileRange);
  // Hack: notify the optional callback that this file was included.
  if (onFortranInclude) {
    (*onFortranInclude)(included->path());
  }
}
```
This is, of course, not ideal, but it works for a first experiment.
I would like to start a conversation about adding a supported way to intercept the preprocessor and the prescanner, so that I can access this information without having to hack the flang sources.
Question 1: Would this be something that the LLVM/Flang community would be interested in?
Question 2: Would you be open to starting a conversation about how to properly design the solution, perhaps using clang's `PPCallbacks` as an example to follow?
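To make Question 2 a bit more concrete, here is a rough sketch of the shape such an interface could take, loosely modelled on clang's `PPCallbacks`. Everything in it is an assumption for discussion only: `PrescannerCallbacks`, `OnInclude`, `IncludeGraphRecorder` and `NotifyInclude` are hypothetical names that do not exist in flang, and `NotifyInclude` is just a stand-in for the place where the prescanner would notify a registered client.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical observer interface, loosely modelled on clang's PPCallbacks.
// Nothing here exists in flang today; all names are illustrative.
class PrescannerCallbacks {
public:
  virtual ~PrescannerCallbacks() = default;
  // Would be called once for each INCLUDE file that was opened successfully.
  // The default implementation does nothing, so clients override only
  // the events they care about.
  virtual void OnInclude(const std::string & /*includer*/,
                         const std::string & /*included*/) {}
};

// Example client: records "includer depends on included" edges, which is
// what a physical (file-level) dependency graph needs.
class IncludeGraphRecorder : public PrescannerCallbacks {
public:
  void OnInclude(const std::string &includer,
                 const std::string &included) override {
    edges_.emplace_back(includer, included);
  }
  const std::vector<std::pair<std::string, std::string>> &edges() const {
    return edges_;
  }

private:
  std::vector<std::pair<std::string, std::string>> edges_;
};

// Stand-in for the prescanner side: forwards the event only when a
// callbacks object has been registered (nullptr means "no client").
inline void NotifyInclude(PrescannerCallbacks *callbacks,
                          const std::string &includer,
                          const std::string &included) {
  if (callbacks) {
    callbacks->OnInclude(includer, included);
  }
}
```

With something along these lines, a tool like Codevis would pass its own `PrescannerCallbacks` subclass to the parser and read the recorded edges afterwards, instead of patching `prescan.cpp`. How the object would be registered and which other events it should expose are exactly the design questions I would like to discuss.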
If you need further information please let me know.
</pre>