[PATCH] D139534: [analyzer] Don't escape local static memregions on bind

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 21 09:13:34 PST 2022


steakhal added a comment.

Finally, I had some time to come back to this.
Thanks for taking the time for such a detailed response, kudos! @NoQ

In D139534#3999455 <https://reviews.llvm.org/D139534#3999455>, @NoQ wrote:

> Ok screw it, my static example is still broken. There's technically still a leak in it because there's no time for the other code to kick in between `p = (int *)malloc(sizeof(int));` and `p = 0;` to read the value from `p`.
>
> But in practice there could be a lot of things going on in between, that with your patch might not trigger enough escapes. Let me disconnect from the original test case and try to build a few examples more carefully.
>
>   void foo() {
>     static int *p;
>     escape(&p);
>     p = malloc(sizeof(int));
>     free_whatever_escaped();
>     p = 0; // no-leak
>   }
>
> This isn't a leak, regardless of whether the variable is static or not. Currently we display a leak when the variable is non-static (bad) but don't display a leak when the variable is static (good). IIUC your patch might break this. The important part is that in this case there's no direct connection between the tracked pointer `p` and the escaped region `&p` because `p` isn't stored in `&p` until the next line. So if you remove invalidation of `p` on `free_whatever_escaped();`, it'll cause a new false positive in the static case.

Yes, I'd introduce FP for this case.

>   void foo(cond) {
>     static int *p;
>     if (cond) {
>       p = malloc(sizeof(int));
>       free_whatever_escaped();
>       p = 0; // no-leak
>     }
>     escape(&p);
>   }
>
> In this case there's no leak as long as the first invocation of the function always has `cond` set to false. In this case, again, there's no direct connection between `&p` and the return value of `malloc()` when direct escape of `&p` happens. On top of that, the direct escape of `&p` isn't observed at all until much later in the analysis. But just because the local is static, and the function can be called multiple times, the escape at the end of the analysis may "affect" our decision-making at the beginning of the analysis. I suspect your patch breaks this example as well.

Yes, we will report a FP there with my patch.

> So basically in order to do the right thing here, you need to make sure there are no direct escapes of that static variable *anywhere* in the function. But as long as there's no direct escapes, you're probably good to go. With non-static locals you only need to observe escapes *before* the leak (but possibly before the allocation as well), which we currently don't do a good job at anyway.

I agree that we could improve escape handling of static variables by checking if that could have escaped *anywhere*. Although, I don't plan to invest time there.

> Then, again, it's possible that your patch improves FP rates than what we currently have, just by being a different trade-off, given how artificial my examples are, but we'll need some data to figure it out.

According to my differential analysis, I saw only 2 new issues and 1 disappearing issue.
The test corpus includes roughly 170 projects, including really big ones; windows, Linux, old, new, C and C++ projects as well.
Also, note that we are not using the default set of checks, nor the cc1 driver.

The two new issues were both TPs, for memory leaks. I'll elaborate on these later.
I haven't investigated the single disappearing issue, which was about `The right operand of '<' is a garbage value`. In the context, I could see some local static variables though.

---

Here is the gist of one *new* TP:

  // OpenJDK src/java.base/unix/native/libjava/java_props_md.c
  #define NULL ((void*)0)
  typedef unsigned int uid_t;
  extern struct passwd *getpwuid(uid_t uid);
  extern uid_t getuid();
  extern char *strdup(const char *s);
  extern char *getenv(const char *name);
  struct passwd {
    char *pw_dir;
  };
  
  typedef struct {
    char *user_home;
  } java_props_t;
  
  java_props_t *GetJavaProperties(struct JNIEnv *env) {
    static java_props_t sprops;
  
    struct passwd *pwent = getpwuid(getuid());
    sprops.user_home = pwent ? strdup(pwent->pw_dir) : NULL; // taking 'true' branch; allocating memory by 'strdup'
    if (sprops.user_home == NULL || sprops.user_home[0] == '\0' ||
        sprops.user_home[1] == '\0') {
      char *user_home = getenv("HOME");
      if ((user_home != NULL) && (user_home[0] != '\0'))
        sprops.user_home = user_home; // leaking previous `sprops.user_home` !
    }
  }

The other TP in the Samba project looks quite similar but involves a loop complicating things.

---

By looking at the results, I think my patch helps more than hinders, and I'm also not expecting many users severely impacted by this change.



================
Comment at: clang/test/Analysis/malloc-static-storage.cpp:33-38
+void malloc_escape() {
+  static int *p;
+  p = (int *)malloc(sizeof(int));
+  escape(p); // no-leak
+  p = 0; // no-leak
+}
----------------
NoQ wrote:
> NoQ wrote:
> > NoQ wrote:
> > > The main problem with static locals is that this can happen the other way round:
> > > 
> > > ```lang=c
> > > void malloc_escape() {
> > >   static int *p;
> > >   escape(p);
> > >   p = (int *)malloc(sizeof(int));
> > >   p = 0; // no-leak
> > > }
> > > ```
> > Wait, I misread. I'm thinking of a situation like this:
> > ```
> > void malloc_escape() {
> >   static int *p;
> >   escape(&p); // added '&'
> >   p = (int *)malloc(sizeof(int));
> >   p = 0; // no-leak
> > }
> > ```
> Technically this is also a problem with non-static locals if we complicate the situation a little bit:
> 
> ```lang=c
> void malloc_escape() {
>   int *p;
>   escape(&p);
>   p = (int *)malloc(sizeof(int));
>   free_whatever_escaped();
>   p = 0; // currently false leak warning
> }
> ```
> 
> We've had a lovely conversation about this with @xazax.hun in D71041 and D71224 but we've failed to produce a good solution back then.
Awesome xref. I'll read it. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139534/new/

https://reviews.llvm.org/D139534



More information about the cfe-commits mailing list