[PATCH] D79433: [analyzer] StdLibraryFunctionsChecker: Add summaries for POSIX

Gabor Marton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 6 07:30:14 PDT 2020


martong added a comment.

In D79433#2022244 <https://reviews.llvm.org/D79433#2022244>, @xazax.hun wrote:

> This is cool!


Hi Gabor, thanks for the reviews! :)

> Some questions:
> 
> 1. Have you reported those bugs to CppCheck devs? It might be useful for us as well, as they can also double-check who is right.

Yes, I have just done that:
https://github.com/danmar/cppcheck/pull/2628
https://github.com/danmar/cppcheck/pull/2629
https://github.com/danmar/cppcheck/pull/2630
Actually, while reporting these bugs I realized that not all of them had been fixed completely in this patch; see the latest update.

> 2. This is a really large number of summaries. At this point, I wonder if it would be worth have a separate checker for them (that is utilizing the same infrastructure).

I was thinking about renaming this checker: `StdLibraryFunctionsChecker` -> `LibraryFunctionsChecker`. We could then add a checker option (or a subchecker) to enable these summaries for POSIX.

> 3. Is it worth to have the script you used to do the conversion available for other people?

Well, it is just a hacky scratch script of mine that I did not plan to share. But I am pasting it here in its current state; maybe it will be of some value to somebody in the future.

  #!/usr/bin/python3
  import subprocess
  import sys
  import xml.etree.ElementTree as ET
  from io import StringIO
  
  
  blacklist = [
      'strchr',
      'ctime',
  ]
  
  whitelist = [
  'select',
  'close',
  'htons',
  'socket',
  'setsockopt',
  ]
  
  
  def decr(nr):
      return str(int(nr) - 1)
  
  
  root = ET.parse(sys.argv[1]).getroot()
  for function in root.findall('function'):
  
      pureness = "NoEvalCall"
      if function.find('pure') is not None:
          pureness = "EvalCallAsPure"
  
      names = function.get('name')
      for name in names.split(","):
  
          if name.endswith("_l"):
              continue
  
          # if name in blacklist:
              # continue
  
          # if name not in whitelist:
              # continue
  
          if name.startswith("std::"):
              continue
  
          old_stdout = sys.stdout
          sys.stdout = mystdout = StringIO()
  
          # Get the signature, which is stored in an XML comment. ElementTree
          # drops comments when parsing, so grep the raw file instead.
          sign = subprocess.run(
              ['/bin/grep', str(name) + '(', sys.argv[1]],
              stdout=subprocess.PIPE)
          sign = sign.stdout.decode('utf-8')
          sign = sign.replace(
              '<!--',
              '').replace(
              '-->',
              '').replace(
              '\n',
              '').strip()
          print("// " + sign)
  
          print("addToFunctionSummaryMap(\"" + str(name) +
                # "\",\nSummary(ArgTypes{}, RetType{}, " + pureness + ")")
                "\",\nSummary(" + pureness + ")")
  
          returnValue = function.find('returnValue')
          returnValueConstraint = None
          if returnValue is not None:
              if returnValue.text is not None:  # returnValue constraint
                  returnValueConstraint = returnValue.text
          if returnValueConstraint is not None:
              print(".Case({})".format(returnValue.text))
  
          args = function.findall('arg')
          for arg in args:
              if arg is not None:
                  nr = arg.get('nr')
                  if nr is None or nr == 'any':
                      continue
                  nr = decr(nr)
  
                  notnull = arg.find('not-null')
                  if notnull is not None:
                      print(".ArgConstraint(NotNull(ArgNo(" + nr + ")))")
  
                  valid = arg.find('valid')
                  if valid is not None:
                      if valid.text.endswith(':'):
                          lower = valid.text.split(':')[0]
                          print(
                              ".ArgConstraint(ArgumentCondition({}, WithinRange, Range({}, Max)))".format(
                                  nr, lower))
                      else:
                          print(
                              ".ArgConstraint(ArgumentCondition({}, WithinRange, Range({})))".format(
                                  nr, valid.text))
  
                  minsize = arg.find('minsize')
                  if minsize is not None:
                      mstype = minsize.get('type')
                      if mstype == 'argvalue':
                          print(
                              ".ArgConstraint(BufferSize({},{}))".format(
                                  nr, decr(minsize.get('arg'))))
                      if mstype == 'mul':
                          print(
                              ".ArgConstraint(BufferSize({},{},{}))".format(
                                  nr, decr(
                                      minsize.get('arg')), decr(
                                      minsize.get('arg2'))))
  
          print(");")
  
          # Print only non-trivial summaries.
          sys.stdout = old_stdout
          if ".ArgConstraint" in mystdout.getvalue() or ".Case" in mystdout.getvalue():
              print(mystdout.getvalue())
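
To make the mapping concrete, here is a minimal sketch of the per-argument part of the conversion, run on one hand-written entry. The `SAMPLE` XML is an assumption made for illustration (written in the style of a Cppcheck library config, not copied from the real POSIX config), and `arg_constraints` is a hypothetical helper name. Only the `not-null`, open-ended `valid` (such as `0:`) and `minsize type="argvalue"` cases are covered.

```python
import xml.etree.ElementTree as ET

# Hand-written sample entry (an assumption for illustration only).
SAMPLE = """
<def>
  <function name="read">
    <arg nr="1"/>
    <arg nr="2">
      <not-null/>
      <minsize type="argvalue" arg="3"/>
    </arg>
    <arg nr="3">
      <valid>0:</valid>
    </arg>
  </function>
</def>
"""


def arg_constraints(function):
    """Return the .ArgConstraint(...) lines for one <function> element."""
    lines = []
    for arg in function.findall('arg'):
        nr = arg.get('nr')
        if nr is None or nr == 'any':
            continue
        # The XML numbers arguments from 1, the summaries from 0.
        nr = str(int(nr) - 1)

        if arg.find('not-null') is not None:
            lines.append(".ArgConstraint(NotNull(ArgNo({})))".format(nr))

        valid = arg.find('valid')
        if valid is not None and valid.text and valid.text.endswith(':'):
            # "N:" means "N or greater", so the upper bound becomes Max.
            lower = valid.text.split(':')[0]
            lines.append(
                ".ArgConstraint(ArgumentCondition({}, WithinRange, "
                "Range({}, Max)))".format(nr, lower))

        minsize = arg.find('minsize')
        if minsize is not None and minsize.get('type') == 'argvalue':
            size_arg = str(int(minsize.get('arg')) - 1)
            lines.append(
                ".ArgConstraint(BufferSize({},{}))".format(nr, size_arg))
    return lines


root = ET.fromstring(SAMPLE.strip())
for function in root.findall('function'):
    for line in arg_constraints(function):
        print(line)
```

For this sample the sketch emits a `NotNull` and a `BufferSize` constraint for the buffer argument and a `WithinRange` constraint for the size argument, each with the argument index shifted down by one.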



> Are there other summaries we might be interested in, like Qt or other popular libraries?

Absolutely. In telecom we are interested in OpenSSL and libcurl, but the number of functions in the corresponding XML files is quite low compared to POSIX.
Also, POSIX has a stable, standardized interface that rarely changes. Other libraries may change more frequently, so I don't think we could support them the same way as POSIX.
Furthermore, POSIX and libc are essential libraries used by almost all C/C++ programs in some form. The other libraries are not that common, so I'd rather not increase the compilation time of this checker to support them.
So I think we should port those libraries only once we can handle our own format for summaries (our own YAML format, or an extension of API notes).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79433/new/

https://reviews.llvm.org/D79433




