<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Gábor:<br>

    </p>

    <blockquote type="cite"

cite="mid:CAPRL4a2XeiLvVC=BtmLnXDQxrR5=T24S5kX3fAq08fTPfdpqqQ@mail.gmail.com">

      <div dir="ltr">

        <div><br>

        </div>

        <div>I was wondering what constraints of the plist

          format make you want to output SARIF natively. I am not

          opposed to adding another output format, but if you miss

          something from the plist during conversion, chances are good

          that other consumers of the plist would find that information

          useful. As far as I understand, the current plist format is

          not frozen in any way, so extending it is also an option

          (regardless of whether a new output format is added).</div>

      </div>

    </blockquote>

    That's a good point. My understanding (and I concede that my

    knowledge of the details is second-hand) is that the plist format

    is quite tightly tied to the way paths are visualized within Xcode.

    Other viewers like to show path visualizations in other ways, but

    supporting those didn't seem to be possible in plist. In

    particular, there are good reasons to have different levels of

    "importance" associated with points and edges. Also, there didn't

    seem to be anything in plist that would allow you to express more

    than one thread. Finally, there is lots of metadata about the

    analysis that can't be expressed in plist.<br>

    <br>

    I certainly agree that it would have been possible to extend plist

    in many directions to compensate, but our feeling is that we would

    just end up with something that approached the expressiveness of

    SARIF, so it makes more sense to just do SARIF natively.<br>

    <br>

    Also, Cppcheck outputs plist too, so extending it for Clang (if not

    done in a backwards-compatible way) would break any consumers of

    output from that tool.<br>

    <br>

    I hope this helps,<br>

    <br>

    -Paul<br>

    <br>

    <blockquote type="cite"

cite="mid:CAPRL4a2XeiLvVC=BtmLnXDQxrR5=T24S5kX3fAq08fTPfdpqqQ@mail.gmail.com">

      <div dir="ltr">

        <div><br>

        </div>

        <div>Regards,</div>

        <div>Gabor<br>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr">On Mon, 17 Sep 2018 at 14:36, Artem Dergachev via

          cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org"

            moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt; wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0 0 0

          .8ex;border-left:1px #ccc solid;padding-left:1ex">Hmm, this

          looks useful. I'd love to see how we perform compared to other

          <br>

          tools and have a look at interesting false negatives that such

          <br>

          comparison would be able to find, though I understand that

          this sort of <br>

          comparison is hard because different tools may report the

          same bug in <br>

          different manners, on different lines of code, with different

          warnings <br>

          and notes, so even if they provide it in the same format,

          matching them <br>

          to each other automatically may be hard.<br>

          <br>

          Analyzer outputs are implemented by PathDiagnosticConsumer

          sub-classes, <br>

          and it should be fairly straightforward to add a new

          sub-class. You need <br>

          to handle different "diagnostic pieces" (events along the

          path, <br>

          directions on how the path runs through the program, etc.).

          Please <br>

          let us know if you think that the class is not receiving

          enough info to <br>

          fill in everything you want to provide - we could probably

          provide it.<br>

          <br>

          As far as I understand, you want to eventually upstream your

          work. In <br>

          this case I encourage you to start as early as possible (i.e.,

          even if <br>

          it's an empty implementation that emits empty files), by

          posting early <br>

          prototypes on our Phabricator and then adding incremental

          patches on top <br>

          of it, rather than wait until your code is finished.

          Essentially, LLVM <br>

          development policy promotes run-time flags as branches and

          discourages <br>

          huge pull-requests from distant forks because otherwise it's

          relatively <br>

          easy to take a wrong turn. We'll be able to advise you on

          what all <br>

          these notes and events mean or on other stuff of ours. There

          have been <br>

          recent changes in how consumers are handled, so please make

          sure you <br>

          work with a recent clang.<br>

          <br>

          <br>

          On 9/17/18 10:51 AM, Paul Anderson via cfe-dev wrote:<br>

          > All:<br>

          ><br>

          > This is my first post to this list, so first, let me give

          a quick <br>

          > introduction. I'm VP of Engineering at GrammaTech, where

          I am in <br>

          > charge of an advanced static analysis tool named

          CodeSonar. It <br>

          > primarily works for C and C++, but also for x86, x64 and

          ARM binaries. <br>

          > There is a little overlap with what CSA does, but

          CodeSonar's strength <br>

          > is in whole-program path-sensitive analysis for serious

          defects and <br>

          > security vulnerabilities.<br>

          ><br>

          > I'm writing to let the community know of some work we

          will be doing <br>

          > that should benefit everyone. I think I know the best way

          forward, but <br>

          > I'd appreciate any words of wisdom and feedback on our

          approach.<br>

          ><br>

          > This work is funded by a government research project

          aimed at <br>

          > modernizing open source static analysis tools. The

          project is named <br>

          > STAMP (the official funding agency page, which is

          admittedly very <br>

          > short on details, is here: <br>

          > <a

            href="https://www.dhs.gov/science-and-technology/csd-stamp"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.dhs.gov/science-and-technology/csd-stamp</a>.)<br>

          ><br>

          > There are several thrusts, but the piece I have been

          working on is <br>

          > aimed at changing tools so that they can communicate more

          effectively <br>

          > with each other. Ultimately there will be a protocol to

          allow tools to <br>

          > exchange information actively, but the first part is

          simpler and <br>

          > fairly straightforward. We will be modifying tools so

          that they can <br>

          > output results in SARIF, a standard output format for

          static analysis <br>

          > tools: <br>

          > <a

            href="https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=sarif"

            rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=sarif</a>.

          The <br>

          > standard was first conceived at Microsoft. I'm on the TC,

          along with <br>

          > representatives from other tool vendors and interested

          users.<br>

          ><br>

          > We've already written an adapter for CSA that can take

          plist-format <br>

          > output and convert it to SARIF, and we plan to make that

          available <br>

          > shortly. However, due to constraints on what is

          expressible with that <br>

          > format, we feel we can do a much better job if we change

          the analyzer <br>

          > to output SARIF natively, controlled by (say)

          -analyzer-output=sarif.<br>

          ><br>

          > We've done some prototyping of this on a fork and have it

          rolling over <br>

          > nicely. There's more to be done though before we are

          ready to submit <br>

          > anything for review. We've read all the material on

          contributing and <br>

          > will follow those guidelines as best we can. However, if

          anyone can <br>

          > think of a reason why we should do anything differently,

          or if there <br>

          > are particular pitfalls we should be aware of, I would

          greatly <br>

          > appreciate that input.<br>

          ><br>

          > Thanks in advance,<br>

          ><br>

          > -Paul<br>

          ><br>

          ><br>

          <br>


        </blockquote>

      </div>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Paul Anderson, VP of Engineering, GrammaTech, Inc.

531 Esty St., Ithaca, NY 14850

Tel: +1 607 273-7340 x118; <a class="moz-txt-link-freetext" href="http://www.grammatech.com">http://www.grammatech.com</a> </pre>

  </body>

</html>