[cfe-dev] [RFC] warning about unparenthized macro replacement lists (CERT/MISRA compliance)

Aaron Ballman aaron at aaronballman.com
Fri Apr 24 07:06:06 PDT 2015


On Fri, Apr 24, 2015 at 7:23 AM, Daniel Marjamäki
<Daniel.Marjamaki at evidente.se> wrote:
>
> Hello!
>
> I have tried to eliminate FPs with heuristics.
>
> I now get warnings in 13 projects
>
> FP
> lex generated code:                        0 (18)
> "*123" and "+1" macros                  4 (4)
> dangerous macro, no bug               3 (7)
> libcap: a->x(y) => a->blk[y] |= ..      1 (1)
>
> TP:
> brighton: division in macro              1 (1)
> dangerous macro, mistake              0 (1)
> real bugs:                                        4 (4)
>
> The heuristic that removed the LEX FPs is that I don't warn when the last token in the replacement list is a MUL.

This also reduces the true positive rate for the dangerous macro
that's a mistake, which is unfortunate. I would probably add this
check to clang-tidy, leave the heuristic out, and then tighten if we
find the FP rate is too high for users. With the original numbers, I
don't think the FP rate was unreasonable for clang-tidy (others may
have different opinions though).

~Aaron

> It should be possible to improve the heuristics further to avoid more FPs.
>
> But I will look into putting this in clang-tidy also.
>
> Best regards,
> Daniel Marjamäki
>
> ..................................................................................................................
> Daniel Marjamäki Senior Engineer
> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>
> Mobile:                 +46 (0)709 12 42 62
> E-mail:                 Daniel.Marjamaki at evidente.se
>
> www.evidente.se
>
> ________________________________________
> Från: aaron.ballman at gmail.com [aaron.ballman at gmail.com] för Aaron Ballman [aaron at aaronballman.com]
> Skickat: den 23 april 2015 18:02
> Till: Daniel Marjamäki
> Cc: Sean Silva; cfe-dev Developers
> Ämne: Re: [cfe-dev] [RFC] warning about unparenthized macro replacement lists (CERT/MISRA compliance)
>
> On Thu, Apr 16, 2015 at 6:52 AM, Daniel Marjamäki
> <Daniel.Marjamaki at evidente.se> wrote:
>> Hello!
>>
>> Here are the statistics right now..
>>
>> I have scanned ~900 projects. All these projects are in Debian (so they are relatively mature).
>>
>> I got warnings in 35 projects.
>>
>> FP:
>> 1 lex generated code:                   18 projects
>> 2 "*123" macros                         3 projects
>> 3 "+1" macros                           1 project
>> 4 dangerous macro, no bug               7 projects
>> 5 libcap: a->x(y) => a->blk[y] |= ..    1 project
>>
>> TP:
>> 6 brighton: division in macro           1 project
>> 7 dangerous macro, mistake              1 project
>> 8 real bugs:                            4 projects
>>
>> there was both TPs and FPs in binutils so this project was counted twice.
>>
>> In my first email I said there was FP for a cast in binutils. I have personally changed my mind and now classify it as a TP. The cast is only applied to the first operand in the macro and not the result, I doubt that is intentional.
>>
>> In my first email I said there was FP for a division in brighton as it might prevent some truncation. I have changed my mind about this particular case and classify this as a TP. If the programmer wanted to avoid truncation then some expressions are not well written.
>>
>> About "lex generated code". It is the exact same case for every project. It is the BEGIN macro it warns about. I don't know how.. but maybe it's possible to write some good heuristic about this.
>>
>> About "dangerous macro, no bug". I used this classification when the macro is dangerous but the execution order does not really matter. I believe that these are likely TP for many users - I do not think this is by design.
>> But it would be relatively safe to skip these warnings when the execution order does not seem to matter.
>> Example:
>> #define X  2 + 3
>> int i = 1 + X;
>>
>> In the "dangerous macro, mistake" case the programmer has clearly tried to enclose the macro in parentheses:
>> #define bcd2dec(x) (((x)&0x70)>>4)*10 + ((x)&0x0F)
>> However, one more () is needed. If the "dangerous macro, no bug" false positives are fixed the warning about this macro will also go away I am afraid. It is only used in a addition so there is no danger.
>>
>> I had thought that there would be more true positives.
>>
>> I still personally think it would make sense to add this in the compiler if we can deal with the false positives good enough. It does detect real bugs.
>>
>> But if you think that the clang-tidy would be more appropriate I am open to that also.
>
> I think that this does catch real bugs, and definitely has value. The
> lex-generated code has me concerned about the false positive rate
> though, since that's a fairly widely-used code generation tool. I
> think this may be best implemented within clang-tidy instead of the
> compiler, especially given that the FP to TP rate is 5:1 with your
> sample set.
>
> Thank you for working on this!
>
> ~Aaron
>
>>
>> Best regards,
>> Daniel Marjamäki
>> ..................................................................................................................
>> Daniel Marjamäki Senior Engineer
>> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>>
>> Mobile:                 +46 (0)709 12 42 62
>> E-mail:                 Daniel.Marjamaki at evidente.se
>>
>> www.evidente.se
>>
>> ________________________________________
>> Från: aaron.ballman at gmail.com [aaron.ballman at gmail.com] för Aaron Ballman [aaron at aaronballman.com]
>> Skickat: den 10 april 2015 23:59
>> Till: Sean Silva
>> Cc: Daniel Marjamäki; cfe-dev Developers
>> Ämne: Re: [cfe-dev] [RFC] warning about unparenthized macro replacement lists (CERT/MISRA compliance)
>>
>> I can see utility in such a diagnostic, but the false positives may be
>> problematic. What is the true positive rate that you're finding from
>> this check?
>>
>> I think that having a series of CERT (and/or MISRA) checkers that can
>> be enabled as a group would be a fantastic idea, but the design might
>> be slightly tricky since those checks are likely to be most naturally
>> placed in drastically different parts of the compiler (Sema, Analysis,
>> the static analyzer, ubsan, clang-tidy, etc).
>>
>> ~Aaron
>>
>> On Thu, Apr 9, 2015 at 6:07 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>> This seems like it would have too many false positives and MISRA/CERT is a
>>> specific coding standard that not everyone uses. It would probably make
>>> sense in a clang-tidy check that enforces MISRA/CERT guidelines.
>>>
>>> On Thu, Apr 9, 2015 at 4:57 AM, Daniel Marjamäki
>>> <Daniel.Marjamaki at evidente.se> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> Hello!
>>>>
>>>> I have written a simple compiler warning for unparenthised macro
>>>> replacement lists. Such macros violates both MISRA and CERT rules.
>>>>
>>>> A compiler warning is written when there is dangerous usage.
>>>>
>>>> For example it writes a warning here:
>>>>
>>>>     #define A  1+2
>>>>     int x = A * 2;  // <- warning
>>>>
>>>> My intention was to make sure that code is compliant to this CERT rule:
>>>>
>>>> https://www.securecoding.cert.org/confluence/display/c/PRE02-C.+Macro+replacement+lists+should+be+parenthesized
>>>>
>>>> However there are some interesting false positives..
>>>>
>>>> I've tested it on these projects so far:
>>>> a52dec_0.7.4 aalib_1.4p5 aaphoto_0.43.1 abcm2ps_7.8.9 abcmidi_20141016
>>>> abe_1.1 acct_6.5.5 ace-of-penguins_1.3 achilles_2 acovea_5.1.1 acpi_1.7
>>>> acpitool_0.5.1 acr_0.9.9 acr38_1.7.11 adjtimex_1.29 adns_1.5.0~rc1
>>>> adolc_2.5.2 adtool_1.3.3 aephea_12-248 aespipe_2.4c aft_5.098 agedu_9723
>>>> aggregate_1.6 aiksaurus_1.2.1+dev-0.12 ajaxterm_0.10 ale_0.9.0.3
>>>> algol68g_2.8 allegro4.2_4.2.2 alsa-lib_1.0.28 alsaplayer_0.99.81 ample_0.5.7
>>>> amule-emc_0.5.2 anthy_9100h antlr_2.7.7 anubis_4.1.1+dfsg1 anyremote_6.4
>>>> anyremote2html_1.4 aolserver4_4.5.1
>>>> apache-mod-auth-ntlm-winbind_0.0.0.lorikeet+svn+801 apcupsd_3.14.8
>>>> aplus-fsf_4.22.1 apparix_07-261 apr_1.5.1 aprsd_2.2.5-13 aprsdigi_3.5.1
>>>> apsfilter_7.2.6 aptitude-robot_1.3.4 arc-gui-clients_0.4.6 archmbox_4.10.0
>>>> argtable2_9 aria2_1.18.8 arpack++_2.3 ascii2binary_2.14 asciidoc_8.6.9
>>>> babl_0.1.12 backintime_1.0.36 backupninja_1.0.1 bar_1.11.1
>>>> barcode_0.98+debian bash_4.3 bash-completion_2.1 bashdb_4.2.0.8 bbe_0.2.2
>>>> bc_1.06.95 bcpp_0.0.20050725 bdfresize_1.5 beecrypt_4.2.1 bfbtester_2.0.1
>>>> bibclean_2.11.4 biblememorizer_0.6.4 bibutils_5.4 biff_0.17.pre20000412
>>>> binfmtc_0.17 binutils_2.25 binutils-h8300-hms_2.16.1 bison_3.0.2.dfsg
>>>> bison++_1.21.11 bld_0.3.4.1 blitz++_0.9 blktool_4 blm_0.9.3
>>>> bluez-hcidump_2.4 bmake_20140620 bmf_0.9.4 bml_0.6.1 boa_0.94.14rc21 bodr_9
>>>> boolstuff_0.1.14 bopm_3.1.3 bottlerocket_0.05b3 bristol_0.60.5
>>>> brltty_5.2~20141018 bsd-finger_0.17
>>>> m17n-docs_1.6.2 m17n-lib_1.6.4 m4_1.4.17 macchanger_1.7.0 mach_0.9.1
>>>> madlib_1.3.0 magic_7.5.241 magicfilter_1.2 magicrescue_1.1.9
>>>> mailfilter_0.8.3 mailtextbody_0.1.3 mailutils_2.99.98 makebootfat_1.4
>>>> makedepf90_2.8.8 makepp_2.0.98.5 maloc_0.2 maq_0.7.1 maradns_2.0.09
>>>> marisa_0.2.4 matanza_0.13+ds1 matchbox-themes-extra_0.3 maude_2.6
>>>> mauve_20140821 mboxgrep_0.7.9 mbr_1.1.11 mbuffer_20140310 mcl_14-137
>>>> mcpp_2.7.2 md5deep_4.3 mdds_0.5.4 mdf2iso_0.3.1 mecab_0.996
>>>> medicalterms_20110608 medusa_2.1.1 mell_1.0.0 mergelog_4.5.1 meschach_1.2b
>>>> metastudent-data_1.0.1 metastudent-data-2_1.0.0 mew_6.6
>>>> mew-beta_7.0.50~6.6+0.20140902 mhash_0.9.9.9 mig_1.4 mimetic_0.9.8
>>>> min12xxw_0.0.9 mined_2000.15.4 minicom_2.7 minidjvu_0.8.svn.2010.05.06+dfsg
>>>> minisapserver_0.3.6 miredo_1.2.6 miscfiles_1.5+dfsg missidentify_1.0 mkcue_1
>>>> mkelfimage_2.7 mlmmj_1.2.18.1 mlocate_0.26 mlt_0.9.6 mm_1.4.2
>>>> mm-common_0.9.7 mmdb_1.25.5 mmorph_2.3.4.2 moap_0.2.7
>>>> mobile-broadband-provider-info_20140317 mod-wsgi_4.3.0
>>>> module-init-tools_3.12 mona_1.4-15 monitoring-plugins_2.1.1
>>>> oath-toolkit_2.4.1 ocaml_4.02.1 ocl-icd_2.2.3 ocrad_0.24
>>>> octave-benchmark_1.1.1 octave-bioinfo_0.1.2 octave-combinatorics_1.0.9
>>>> octave-fixed_0.7.10 octave-ga_0.9.7 octave-ident_1.0.7
>>>> octave-informationtheory_0.1.8 octave-integration_1.0.7 octave-irsa_1.0.7
>>>> octave-mapping_1.0.7 octave-missing-functions_1.0.2 octave-multicore_0.2.15
>>>> octave-nlwing2_1.1.1 octave-outliers_0.13.9 octave-pdb_1.0.7
>>>> octave-physicalconstants_0.1.7 octave-secs2d_0.0.8 octave-simp_1.1.0
>>>> octave-struct_1.0.7 octave-symband_1.0.10 octave-time_1.0.9
>>>> octave-xraylib_1.0.8 octave-zenity_0.5.7 ocurl_0.5.3 ode_0.11.1 odyssey_0.4
>>>> oggvideotools_0.8a ogmtools_1.5 oidentd_2.0.8 omnievents_2.6.2
>>>> omniorb-dfsg_4.2.0 onak_0.4.5 onioncat_0.2.2+svn559 op_1.32
>>>> opari2_1.0.7+dfsg opencore-amr_0.1.3 opendchub_0.8.2 openfst_1.3.3
>>>> openh323_1.18.0.dfsg openload_0.1.2 openmama_2.2.2.1 openmpi_1.6.5
>>>> openmsx_0.8.2 openntpd_5.7p3 openocd_0.8.0 openresolv_3.5.2
>>>> openslp-dfsg_1.2.1
>>>>
>>>> I have looked at some of the warnings....
>>>>
>>>>
>>>> There was warnings in abcm2ps_7.8.9. Example code:
>>>>
>>>>     #define CM        * 28.35 /* factor to transform cm to pt */
>>>>     ...
>>>>     f->topmargin = 1.0 CM;
>>>>     ...
>>>>
>>>> => I believe the macro usage is intentional here. So it's likely a FP from
>>>> the user perspective.
>>>>
>>>>
>>>> There was warnings in bc_1.06.95. There was code generated by flex like:
>>>>
>>>>     #define BEGIN (yy_start) = 1 + 2 *
>>>>     ..
>>>>     BEGIN(slcomment);
>>>>
>>>> => imho stylistically this is code from hell but it's autogenerated. I
>>>> guess users have to disable various warnings for autogenerated code?
>>>>
>>>>
>>>> In binutils there was such code:
>>>>
>>>>     #define ILF_DATA_SIZE      + SIZEOF_ILF_SYMS + ...
>>>>     ...
>>>>     ptr = (bfd_byte *) bfd_zmalloc ((bfd_size_type) ILF_DATA_SIZE);
>>>>     vars.bim->buffer = ptr;
>>>>     vars.bim->size   = ILF_DATA_SIZE;
>>>>
>>>> => It would not hurt to add parentheses to the macro. For information,
>>>> bfd_size_type is a type.
>>>>
>>>>
>>>> In brighton there was this code:
>>>>
>>>>     #define K6W 1000 / 43
>>>>     ..
>>>>     {"", 2, 2 * K6W, KIH, K6W, KIL, 0, 1, 0,
>>>>
>>>> => hmm.. might be intended. If parentheses are used there is more
>>>> truncation.
>>>>
>>>>
>>>> About abcm2ps_7.8.9 and brighton. The warnings are technically correct -
>>>> the code violates MISRA/CERT rules. A user who wants his code to be
>>>> compliant to the CERT rule would want to have these warnings. But I also
>>>> understand that in these projects these warnings would be seen as FP. Do you
>>>> think these are FP I should fix?
>>>>
>>>> Technical question:
>>>> My checker has false negatives when the dangerous macro usage is inside a
>>>> macro.
>>>> Currently my checker uses isMacroID(). It only tells me if an expression
>>>> comes from some macro or not. I want to know if two expressions come from
>>>> different macros.
>>>>
>>>> Best regards,
>>>> Daniel Marjamäki
>>>>
>>>>
>>>>
>>>> ..................................................................................................................
>>>> Daniel Marjamäki Senior Engineer
>>>> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>>>>
>>>> Mobile:                 +46 (0)709 12 42 62
>>>> E-mail:                 Daniel.Marjamaki at evidente.se
>>>>
>>>> www.evidente.se
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>




More information about the cfe-dev mailing list