[cfe-dev] [RFC] warning about unparenthized macro replacement lists (CERT/MISRA compliance)

Aaron Ballman aaron at aaronballman.com
Thu Apr 23 09:02:37 PDT 2015


On Thu, Apr 16, 2015 at 6:52 AM, Daniel Marjamäki
<Daniel.Marjamaki at evidente.se> wrote:
> Hello!
>
> Here are the statistics right now..
>
> I have scanned ~900 projects. All these projects are in Debian (so they are relatively mature).
>
> I got warnings in 35 projects.
>
> FP:
> 1 lex generated code:                   18 projects
> 2 "*123" macros                         3 projects
> 3 "+1" macros                           1 project
> 4 dangerous macro, no bug               7 projects
> 5 libcap: a->x(y) => a->blk[y] |= ..    1 project
>
> TP:
> 6 brighton: division in macro           1 project
> 7 dangerous macro, mistake              1 project
> 8 real bugs:                            4 projects
>
> there was both TPs and FPs in binutils so this project was counted twice.
>
> In my first email I said there was FP for a cast in binutils. I have personally changed my mind and now classify it as a TP. The cast is only applied to the first operand in the macro and not the result, I doubt that is intentional.
>
> In my first email I said there was FP for a division in brighton as it might prevent some truncation. I have changed my mind about this particular case and classify this as a TP. If the programmer wanted to avoid truncation then some expressions are not well written.
>
> About "lex generated code". It is the exact same case for every project. It is the BEGIN macro it warns about. I don't know how.. but maybe it's possible to write some good heuristic about this.
>
> About "dangerous macro, no bug". I used this classification when the macro is dangerous but the execution order does not really matter. I believe that these are likely TP for many users - I do not think this is by design.
> But it would be relatively safe to skip these warnings when the execution order does not seem to matter.
> Example:
> #define X  2 + 3
> int i = 1 + X;
>
> In the "dangerous macro, mistake" case the programmer has clearly tried to enclose the macro in parentheses:
> #define bcd2dec(x) (((x)&0x70)>>4)*10 + ((x)&0x0F)
> However, one more () is needed. If the "dangerous macro, no bug" false positives are fixed the warning about this macro will also go away I am afraid. It is only used in a addition so there is no danger.
>
> I had thought that there would be more true positives.
>
> I still personally think it would make sense to add this in the compiler if we can deal with the false positives good enough. It does detect real bugs.
>
> But if you think that the clang-tidy would be more appropriate I am open to that also.

I think that this does catch real bugs, and definitely has value. The
lex-generated code has me concerned about the false positive rate
though, since that's a fairly widely-used code generation tool. I
think this may be best implemented within clang-tidy instead of the
compiler, especially given that the FP to TP rate is 5:1 with your
sample set.

Thank you for working on this!

~Aaron

>
> Best regards,
> Daniel Marjamäki
> ..................................................................................................................
> Daniel Marjamäki Senior Engineer
> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>
> Mobile:                 +46 (0)709 12 42 62
> E-mail:                 Daniel.Marjamaki at evidente.se
>
> www.evidente.se
>
> ________________________________________
> Från: aaron.ballman at gmail.com [aaron.ballman at gmail.com] för Aaron Ballman [aaron at aaronballman.com]
> Skickat: den 10 april 2015 23:59
> Till: Sean Silva
> Cc: Daniel Marjamäki; cfe-dev Developers
> Ämne: Re: [cfe-dev] [RFC] warning about unparenthized macro replacement lists (CERT/MISRA compliance)
>
> I can see utility in such a diagnostic, but the false positives may be
> problematic. What is the true positive rate that you're finding from
> this check?
>
> I think that having a series of CERT (and/or MISRA) checkers that can
> be enabled as a group would be a fantastic idea, but the design might
> be slightly tricky since those checks are likely to be most naturally
> placed in drastically different parts of the compiler (Sema, Analysis,
> the static analyzer, ubsan, clang-tidy, etc).
>
> ~Aaron
>
> On Thu, Apr 9, 2015 at 6:07 PM, Sean Silva <chisophugis at gmail.com> wrote:
>> This seems like it would have too many false positives and MISRA/CERT is a
>> specific coding standard that not everyone uses. It would probably make
>> sense in a clang-tidy check that enforces MISRA/CERT guidelines.
>>
>> On Thu, Apr 9, 2015 at 4:57 AM, Daniel Marjamäki
>> <Daniel.Marjamaki at evidente.se> wrote:
>>>
>>>
>>>
>>>
>>> Hello!
>>>
>>> I have written a simple compiler warning for unparenthised macro
>>> replacement lists. Such macros violates both MISRA and CERT rules.
>>>
>>> A compiler warning is written when there is dangerous usage.
>>>
>>> For example it writes a warning here:
>>>
>>>     #define A  1+2
>>>     int x = A * 2;  // <- warning
>>>
>>> My intention was to make sure that code is compliant to this CERT rule:
>>>
>>> https://www.securecoding.cert.org/confluence/display/c/PRE02-C.+Macro+replacement+lists+should+be+parenthesized
>>>
>>> However there are some interesting false positives..
>>>
>>> I've tested it on these projects so far:
>>> a52dec_0.7.4 aalib_1.4p5 aaphoto_0.43.1 abcm2ps_7.8.9 abcmidi_20141016
>>> abe_1.1 acct_6.5.5 ace-of-penguins_1.3 achilles_2 acovea_5.1.1 acpi_1.7
>>> acpitool_0.5.1 acr_0.9.9 acr38_1.7.11 adjtimex_1.29 adns_1.5.0~rc1
>>> adolc_2.5.2 adtool_1.3.3 aephea_12-248 aespipe_2.4c aft_5.098 agedu_9723
>>> aggregate_1.6 aiksaurus_1.2.1+dev-0.12 ajaxterm_0.10 ale_0.9.0.3
>>> algol68g_2.8 allegro4.2_4.2.2 alsa-lib_1.0.28 alsaplayer_0.99.81 ample_0.5.7
>>> amule-emc_0.5.2 anthy_9100h antlr_2.7.7 anubis_4.1.1+dfsg1 anyremote_6.4
>>> anyremote2html_1.4 aolserver4_4.5.1
>>> apache-mod-auth-ntlm-winbind_0.0.0.lorikeet+svn+801 apcupsd_3.14.8
>>> aplus-fsf_4.22.1 apparix_07-261 apr_1.5.1 aprsd_2.2.5-13 aprsdigi_3.5.1
>>> apsfilter_7.2.6 aptitude-robot_1.3.4 arc-gui-clients_0.4.6 archmbox_4.10.0
>>> argtable2_9 aria2_1.18.8 arpack++_2.3 ascii2binary_2.14 asciidoc_8.6.9
>>> babl_0.1.12 backintime_1.0.36 backupninja_1.0.1 bar_1.11.1
>>> barcode_0.98+debian bash_4.3 bash-completion_2.1 bashdb_4.2.0.8 bbe_0.2.2
>>> bc_1.06.95 bcpp_0.0.20050725 bdfresize_1.5 beecrypt_4.2.1 bfbtester_2.0.1
>>> bibclean_2.11.4 biblememorizer_0.6.4 bibutils_5.4 biff_0.17.pre20000412
>>> binfmtc_0.17 binutils_2.25 binutils-h8300-hms_2.16.1 bison_3.0.2.dfsg
>>> bison++_1.21.11 bld_0.3.4.1 blitz++_0.9 blktool_4 blm_0.9.3
>>> bluez-hcidump_2.4 bmake_20140620 bmf_0.9.4 bml_0.6.1 boa_0.94.14rc21 bodr_9
>>> boolstuff_0.1.14 bopm_3.1.3 bottlerocket_0.05b3 bristol_0.60.5
>>> brltty_5.2~20141018 bsd-finger_0.17
>>> m17n-docs_1.6.2 m17n-lib_1.6.4 m4_1.4.17 macchanger_1.7.0 mach_0.9.1
>>> madlib_1.3.0 magic_7.5.241 magicfilter_1.2 magicrescue_1.1.9
>>> mailfilter_0.8.3 mailtextbody_0.1.3 mailutils_2.99.98 makebootfat_1.4
>>> makedepf90_2.8.8 makepp_2.0.98.5 maloc_0.2 maq_0.7.1 maradns_2.0.09
>>> marisa_0.2.4 matanza_0.13+ds1 matchbox-themes-extra_0.3 maude_2.6
>>> mauve_20140821 mboxgrep_0.7.9 mbr_1.1.11 mbuffer_20140310 mcl_14-137
>>> mcpp_2.7.2 md5deep_4.3 mdds_0.5.4 mdf2iso_0.3.1 mecab_0.996
>>> medicalterms_20110608 medusa_2.1.1 mell_1.0.0 mergelog_4.5.1 meschach_1.2b
>>> metastudent-data_1.0.1 metastudent-data-2_1.0.0 mew_6.6
>>> mew-beta_7.0.50~6.6+0.20140902 mhash_0.9.9.9 mig_1.4 mimetic_0.9.8
>>> min12xxw_0.0.9 mined_2000.15.4 minicom_2.7 minidjvu_0.8.svn.2010.05.06+dfsg
>>> minisapserver_0.3.6 miredo_1.2.6 miscfiles_1.5+dfsg missidentify_1.0 mkcue_1
>>> mkelfimage_2.7 mlmmj_1.2.18.1 mlocate_0.26 mlt_0.9.6 mm_1.4.2
>>> mm-common_0.9.7 mmdb_1.25.5 mmorph_2.3.4.2 moap_0.2.7
>>> mobile-broadband-provider-info_20140317 mod-wsgi_4.3.0
>>> module-init-tools_3.12 mona_1.4-15 monitoring-plugins_2.1.1
>>> oath-toolkit_2.4.1 ocaml_4.02.1 ocl-icd_2.2.3 ocrad_0.24
>>> octave-benchmark_1.1.1 octave-bioinfo_0.1.2 octave-combinatorics_1.0.9
>>> octave-fixed_0.7.10 octave-ga_0.9.7 octave-ident_1.0.7
>>> octave-informationtheory_0.1.8 octave-integration_1.0.7 octave-irsa_1.0.7
>>> octave-mapping_1.0.7 octave-missing-functions_1.0.2 octave-multicore_0.2.15
>>> octave-nlwing2_1.1.1 octave-outliers_0.13.9 octave-pdb_1.0.7
>>> octave-physicalconstants_0.1.7 octave-secs2d_0.0.8 octave-simp_1.1.0
>>> octave-struct_1.0.7 octave-symband_1.0.10 octave-time_1.0.9
>>> octave-xraylib_1.0.8 octave-zenity_0.5.7 ocurl_0.5.3 ode_0.11.1 odyssey_0.4
>>> oggvideotools_0.8a ogmtools_1.5 oidentd_2.0.8 omnievents_2.6.2
>>> omniorb-dfsg_4.2.0 onak_0.4.5 onioncat_0.2.2+svn559 op_1.32
>>> opari2_1.0.7+dfsg opencore-amr_0.1.3 opendchub_0.8.2 openfst_1.3.3
>>> openh323_1.18.0.dfsg openload_0.1.2 openmama_2.2.2.1 openmpi_1.6.5
>>> openmsx_0.8.2 openntpd_5.7p3 openocd_0.8.0 openresolv_3.5.2
>>> openslp-dfsg_1.2.1
>>>
>>> I have looked at some of the warnings....
>>>
>>>
>>> There was warnings in abcm2ps_7.8.9. Example code:
>>>
>>>     #define CM        * 28.35 /* factor to transform cm to pt */
>>>     ...
>>>     f->topmargin = 1.0 CM;
>>>     ...
>>>
>>> => I believe the macro usage is intentional here. So it's likely a FP from
>>> the user perspective.
>>>
>>>
>>> There was warnings in bc_1.06.95. There was code generated by flex like:
>>>
>>>     #define BEGIN (yy_start) = 1 + 2 *
>>>     ..
>>>     BEGIN(slcomment);
>>>
>>> => imho stylistically this is code from hell but it's autogenerated. I
>>> guess users have to disable various warnings for autogenerated code?
>>>
>>>
>>> In binutils there was such code:
>>>
>>>     #define ILF_DATA_SIZE      + SIZEOF_ILF_SYMS + ...
>>>     ...
>>>     ptr = (bfd_byte *) bfd_zmalloc ((bfd_size_type) ILF_DATA_SIZE);
>>>     vars.bim->buffer = ptr;
>>>     vars.bim->size   = ILF_DATA_SIZE;
>>>
>>> => It would not hurt to add parentheses to the macro. For information,
>>> bfd_size_type is a type.
>>>
>>>
>>> In brighton there was this code:
>>>
>>>     #define K6W 1000 / 43
>>>     ..
>>>     {"", 2, 2 * K6W, KIH, K6W, KIL, 0, 1, 0,
>>>
>>> => hmm.. might be intended. If parentheses are used there is more
>>> truncation.
>>>
>>>
>>> About abcm2ps_7.8.9 and brighton. The warnings are technically correct -
>>> the code violates MISRA/CERT rules. A user who wants his code to be
>>> compliant to the CERT rule would want to have these warnings. But I also
>>> understand that in these projects these warnings would be seen as FP. Do you
>>> think these are FP I should fix?
>>>
>>> Technical question:
>>> My checker has false negatives when the dangerous macro usage is inside a
>>> macro.
>>> Currently my checker uses isMacroID(). It only tells me if an expression
>>> comes from some macro or not. I want to know if two expressions come from
>>> different macros.
>>>
>>> Best regards,
>>> Daniel Marjamäki
>>>
>>>
>>>
>>> ..................................................................................................................
>>> Daniel Marjamäki Senior Engineer
>>> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>>>
>>> Mobile:                 +46 (0)709 12 42 62
>>> E-mail:                 Daniel.Marjamaki at evidente.se
>>>
>>> www.evidente.se
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>




More information about the cfe-dev mailing list