[cfe-dev] [analyzer][RFC] Get info from the LLVM IR for precision

Tue Aug 25 00:37:05 PDT 2020

> And as John says, that'd have the advantage of being more predictable;
> we'd no longer have to investigate sudden changes in analysis results that
> are in fact caused by backend changes.

I believe that all individual LLVM passes are implemented in such a way that
we can reuse them in any exotic pipeline. Of course there are dependencies
between the passes, but apart from that I don't think that Clang backend
changes should matter that much. Otherwise, custom pipelines would be a
nightmare to maintain.

> In particular i'm worried for people who treat analyzer warnings as
> errors in their builds; for them any update in the compiler would now cause
> their build to fail

Well, we could protect them by swallowing all the diagnostics from the
CodeGen part. And if CodeGen emits any diagnostics, we could omit the IR.

> So i believe that implementing as many of these analyses over the Clang
> CFG (or in many cases it might be over the AST as well) would be beneficial
> and should be done regardless of this experiment. Gabor, how much did you
> try that? Because i believe you should try that and compare the results, at
> least for some analyses that are easy to implement.

Yeah, I agree that it is worth trying to implement at least the simplest
ones over the Clang CFG. That way we would see whether anything is missing
from our infrastructure in the CSA, and we could compare the results and the
performance of the two approaches. I am thinking about starting with the
pureness info, which involves implementing GlobalsModRef over the Clang CFG.
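
A minimal sketch of what the core of such a pureness analysis could look
like, assuming a toy call-graph model in plain C++. FunctionSummary,
CallGraph and computeMayModGlobals are hypothetical names for illustration,
not Clang or LLVM APIs. The real GlobalsModRef pass in LLVM does more: it
also tracks reads and reasons about individual globals whose address is not
taken.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model of a function; not a Clang or LLVM data structure.
struct FunctionSummary {
  bool WritesGlobalDirectly = false; // body contains a store to a global
  std::vector<std::string> Callees;  // names of directly called functions
};

using CallGraph = std::map<std::string, FunctionSummary>;

// For every function in CG, compute whether it may modify global state,
// either directly or through a callee. Callees without a summary (for
// example external functions) are conservatively assumed to modify globals.
std::map<std::string, bool> computeMayModGlobals(const CallGraph &CG) {
  std::map<std::string, bool> MayMod;
  for (const auto &Entry : CG)
    MayMod[Entry.first] = Entry.second.WritesGlobalDirectly;

  // Propagate "may modify" from callees to callers until a fixed point.
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (const auto &Entry : CG) {
      if (MayMod[Entry.first])
        continue; // already known to modify globals
      for (const std::string &Callee : Entry.second.Callees) {
        auto It = MayMod.find(Callee);
        bool CalleeMods = (It == MayMod.end()) || It->second;
        if (CalleeMods) {
          MayMod[Entry.first] = true;
          Changed = true;
          break;
        }
      }
    }
  }
  return MayMod;
}
```

A function whose entry stays false is a candidate for being treated as not
modifying globals. An implementation over the Clang CFG would have to derive
WritesGlobalDirectly and Callees from the statements in each CFG block, and
that is where the C++ corner cases come in.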

> The reason why the use of LLVM IR in the static analyzer gets really
> interesting is because there are already a huge lot of analyses already
> implemented over it and getting access to them "for free" (in terms of
> implementation cost) is fairly tempting. I think that's the only real
> reason;

There is another reason (as G. Horvath mentions as well): many of the
analyses are quite painful to implement on our current CFG compared to an
already lowered representation like the LLVM IR. However, I agree that
maybe the LLVM IR is not the representation we should lower to. There is a
desire/attempt to use MLIR in Clang. I can't wait to hear the presentation
about CIL (Common MLIR Dialect for C/C++ and Fortran) at the upcoming LLVM
dev meeting; it would be great to know the status.
Still, I think it could take years until we have a proper Clang
Intermediate Language incorporated into the Clang CFG. In contrast, we
could immediately start using the analyses already implemented on top of
the LLVM IR.

Gabor

On Mon, Aug 17, 2020 at 1:31 PM Gábor Horváth <xazax.hun at gmail.com> wrote:

>
> On Sun, 16 Aug 2020 at 21:57, Artem Dergachev <noqnoqneo at gmail.com> wrote:
>
>>
>> So i believe that implementing as many of these analyses over the Clang
>> CFG (or in many cases it might be over the AST as well) would be beneficial
>> and should be done regardless of this experiment.
>>
>
> While I do agree that this would be awesome, I think many of those
> analyses are quite painful to implement on our current CFG compared to an
> already lowered representation like the LLVM IR which can be canonicalized
> and there are fewer corner cases and peculiarities to handle compared to
> the C++ language. Having the option to derive certain information from a
> representation that is easier to work with for some purposes might be
> useful for future analyses as well, not only for leveraging currently
> implemented analyses. Having a proper Clang IR could of course void this
> argument.
>