[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)

Malavika Samak via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 13 14:41:15 PDT 2024


https://github.com/malavikasamak updated https://github.com/llvm/llvm-project/pull/101585

>From c9f2b131aea2c0d9d1405cb00c54dde859750d0c Mon Sep 17 00:00:00 2001
From: MalavikaSamak <malavika2 at apple.com>
Date: Thu, 1 Aug 2024 11:01:36 -0700
Subject: [PATCH] [attributes][-Wunsafe-buffer-usage] Support adding
 unsafe_buffer_usage attribute to struct fields.

Extend unsafe_buffer_usage attribute to support struct fields.

Adding this attribute to the fields causes the compiler to issue a warning at every access site
of the field. The attribute should be attached to struct fields to inform the clients of the struct
that the warned fields are prone OOB accesses.
---
 clang/include/clang/Basic/Attr.td             |   2 +-
 clang/include/clang/Basic/AttrDocs.td         | 136 ++++++++-----
 .../clang/Basic/DiagnosticSemaKinds.td        |   3 +-
 clang/lib/Analysis/UnsafeBufferUsage.cpp      |  13 +-
 clang/lib/Sema/AnalysisBasedWarnings.cpp      |  14 +-
 ...a-attribute-supported-attributes-list.test |   2 +-
 .../warn-unsafe-buffer-usage-field-attr.cpp   | 190 ++++++++++++++++++
 7 files changed, 297 insertions(+), 63 deletions(-)
 create mode 100644 clang/test/SemaCXX/warn-unsafe-buffer-usage-field-attr.cpp

diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 4825979a974d22..2cc21d67ddffb2 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4447,7 +4447,7 @@ def ReleaseHandle : InheritableParamAttr {
 
 def UnsafeBufferUsage : InheritableAttr {
   let Spellings = [Clang<"unsafe_buffer_usage">];
-  let Subjects = SubjectList<[Function]>;
+  let Subjects = SubjectList<[Function, Field]>;
   let Documentation = [UnsafeBufferUsageDocs];
 }
 
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 99738812c81579..3cc6edad98599d 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -6764,77 +6764,103 @@ def UnsafeBufferUsageDocs : Documentation {
   let Category = DocCatFunction;
   let Content = [{
 The attribute ``[[clang::unsafe_buffer_usage]]`` should be placed on functions
-that need to be avoided as they are prone to buffer overflows. It is designed to
-work together with the off-by-default compiler warning ``-Wunsafe-buffer-usage``
-to help codebases transition away from raw pointer based buffer management,
-in favor of safer abstractions such as C++20 ``std::span``. The attribute causes
-``-Wunsafe-buffer-usage`` to warn on every use of the function, and it may
-enable ``-Wunsafe-buffer-usage`` to emit automatic fix-it hints
-which would help the user replace such unsafe functions with safe
+that need to be avoided as they are prone to buffer overflows or unsafe buffer
+struct fields. It is designed to work together with the off-by-default compiler
+warning ``-Wunsafe-buffer-usage``to help codebases transition away from raw pointer
+based buffer management, in favor of safer abstractions such as C++20 ``std::span``.
+The attribute causes ``-Wunsafe-buffer-usage`` to warn on every use of the function or
+the field it is attached to, and it may also lead to emission of automatic fix-it
+hints which would help the user replace the use of unsafe functions(/fields) with safe
 alternatives, though the attribute can be used even when the fix can't be automated.
 
-The attribute does not suppress ``-Wunsafe-buffer-usage`` inside the function
-to which it is attached. These warnings still need to be addressed.
+* Attribute attached to functions: The attribute does not suppress 
+  ``-Wunsafe-buffer-usage`` inside the function to which it is attached. 
+  These warnings still need to be addressed.
 
-The attribute is warranted even if the only way a function can overflow
-the buffer is by violating the function's preconditions. For example, it
-would make sense to put the attribute on function ``foo()`` below because
-passing an incorrect size parameter would cause a buffer overflow:
+  The attribute is warranted even if the only way a function can overflow
+  the buffer is by violating the function's preconditions. For example, it
+  would make sense to put the attribute on function ``foo()`` below because
+  passing an incorrect size parameter would cause a buffer overflow:
 
-.. code-block:: c++
+  .. code-block:: c++
 
-  [[clang::unsafe_buffer_usage]]
-  void foo(int *buf, size_t size) {
-    for (size_t i = 0; i < size; ++i) {
-      buf[i] = i;
+    [[clang::unsafe_buffer_usage]]
+    void foo(int *buf, size_t size) {
+      for (size_t i = 0; i < size; ++i) {
+        buf[i] = i;
+      }
     }
-  }
 
-The attribute is NOT warranted when the function uses safe abstractions,
-assuming that these abstractions weren't misused outside the function.
-For example, function ``bar()`` below doesn't need the attribute,
-because assuming that the container ``buf`` is well-formed (has size that
-fits the original buffer it refers to), overflow cannot occur:
+  The attribute is NOT warranted when the function uses safe abstractions,
+  assuming that these abstractions weren't misused outside the function.
+  For example, function ``bar()`` below doesn't need the attribute,
+  because assuming that the container ``buf`` is well-formed (has size that
+  fits the original buffer it refers to), overflow cannot occur:
 
-.. code-block:: c++
+  .. code-block:: c++
 
-  void bar(std::span<int> buf) {
-    for (size_t i = 0; i < buf.size(); ++i) {
-      buf[i] = i;
+    void bar(std::span<int> buf) {
+      for (size_t i = 0; i < buf.size(); ++i) {
+        buf[i] = i;
+      }
     }
-  }
 
-In this case function ``bar()`` enables the user to keep the buffer
-"containerized" in a span for as long as possible. On the other hand,
-Function ``foo()`` in the previous example may have internal
-consistency, but by accepting a raw buffer it requires the user to unwrap
-their span, which is undesirable according to the programming model
-behind ``-Wunsafe-buffer-usage``.
+  In this case function ``bar()`` enables the user to keep the buffer
+  "containerized" in a span for as long as possible. On the other hand,
+  Function ``foo()`` in the previous example may have internal
+  consistency, but by accepting a raw buffer it requires the user to unwrap
+  their span, which is undesirable according to the programming model
+  behind ``-Wunsafe-buffer-usage``.
 
-The attribute is warranted when a function accepts a raw buffer only to
-immediately put it into a span:
+  The attribute is warranted when a function accepts a raw buffer only to
+  immediately put it into a span:
 
-.. code-block:: c++
+  .. code-block:: c++
 
-  [[clang::unsafe_buffer_usage]]
-  void baz(int *buf, size_t size) {
-    std::span<int> sp{ buf, size };
-    for (size_t i = 0; i < sp.size(); ++i) {
-      sp[i] = i;
+    [[clang::unsafe_buffer_usage]]
+    void baz(int *buf, size_t size) {
+      std::span<int> sp{ buf, size };
+      for (size_t i = 0; i < sp.size(); ++i) {
+        sp[i] = i;
+      }
     }
-  }
 
-In this case ``baz()`` does not contain any unsafe operations, but the awkward
-parameter type causes the caller to unwrap the span unnecessarily.
-Note that regardless of the attribute, code inside ``baz()`` isn't flagged
-by ``-Wunsafe-buffer-usage`` as unsafe. It is definitely undesirable,
-but if ``baz()`` is on an API surface, there is no way to improve it
-to make it as safe as ``bar()`` without breaking the source and binary
-compatibility with existing users of the function. In such cases
-the proper solution would be to create a different function (possibly
-an overload of ``baz()``) that accepts a safe container like ``bar()``,
-and then use the attribute on the original ``baz()`` to help the users
-update their code to use the new function.
+  In this case ``baz()`` does not contain any unsafe operations, but the awkward
+  parameter type causes the caller to unwrap the span unnecessarily.
+  Note that regardless of the attribute, code inside ``baz()`` isn't flagged
+  by ``-Wunsafe-buffer-usage`` as unsafe. It is definitely undesirable,
+  but if ``baz()`` is on an API surface, there is no way to improve it
+  to make it as safe as ``bar()`` without breaking the source and binary
+  compatibility with existing users of the function. In such cases
+  the proper solution would be to create a different function (possibly
+  an overload of ``baz()``) that accepts a safe container like ``bar()``,
+  and then use the attribute on the original ``baz()`` to help the users
+  update their code to use the new function.
+
+* Attribute attached to fields: The attribute should only be attached to 
+  struct fields, if the fields can not be updated to a safe type with bounds 
+  check, such as std::span. In other words, the buffers prone to unsafe accesses 
+  should always be updated to use safe containers/views and attaching the attribute 
+  must be last resort when such an update is infeasible.
+
+  The attribute can be placed on individual fields or a set of them as shown below.
+
+  .. code-block:: c++
+
+    struct A {
+      [[clang::unsafe_buffer_usage]]
+      int *ptr1;
+
+      [[clang::unsafe_buffer_usage]]
+      int *ptr2, buf[10];
+
+      [[clang::unsafe_buffer_usage]]
+      size_t sz;
+    };
+
+  Here, every read/write to the fields ptr1, ptr2, buf and sz will trigger a warning 
+  that the field has been explcitly marked as unsafe due to unsafe-buffer operations.
+
   }];
 }
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 810abe4f23e31e..b0428d28667a51 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12371,7 +12371,8 @@ def warn_unsafe_buffer_variable : Warning<
   InGroup<UnsafeBufferUsage>, DefaultIgnore;
 def warn_unsafe_buffer_operation : Warning<
   "%select{unsafe pointer operation|unsafe pointer arithmetic|"
-  "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data}0">,
+  "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data|"
+  "field %1 prone to unsafe buffer manipulation}0">,
   InGroup<UnsafeBufferUsage>, DefaultIgnore;
 def note_unsafe_buffer_operation : Note<
   "used%select{| in pointer arithmetic| in buffer access}0 here">;
diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index 866222380974b6..051381edabf0b2 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -926,22 +926,27 @@ class CArrayToPtrAssignmentGadget : public FixableGadget {
 /// A call of a function or method that performs unchecked buffer operations
 /// over one of its pointer parameters.
 class UnsafeBufferUsageAttrGadget : public WarningGadget {
-  constexpr static const char *const OpTag = "call_expr";
-  const CallExpr *Op;
+  constexpr static const char *const OpTag = "attr_expr";
+  const Expr *Op;
 
 public:
   UnsafeBufferUsageAttrGadget(const MatchFinder::MatchResult &Result)
       : WarningGadget(Kind::UnsafeBufferUsageAttr),
-        Op(Result.Nodes.getNodeAs<CallExpr>(OpTag)) {}
+        Op(Result.Nodes.getNodeAs<Expr>(OpTag)) {}
 
   static bool classof(const Gadget *G) {
     return G->getKind() == Kind::UnsafeBufferUsageAttr;
   }
 
   static Matcher matcher() {
+    auto HasUnsafeFieldDecl =
+        member(fieldDecl(hasAttr(attr::UnsafeBufferUsage)));
+
     auto HasUnsafeFnDecl =
         callee(functionDecl(hasAttr(attr::UnsafeBufferUsage)));
-    return stmt(callExpr(HasUnsafeFnDecl).bind(OpTag));
+
+    return stmt(anyOf(callExpr(HasUnsafeFnDecl).bind(OpTag),
+                      memberExpr(HasUnsafeFieldDecl).bind(OpTag)));
   }
 
   void handleUnsafeOperation(UnsafeBufferUsageHandler &Handler,
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 0f604c61fa3af9..e6ce89dc7ec406 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -2231,6 +2231,7 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
     SourceLocation Loc;
     SourceRange Range;
     unsigned MsgParam = 0;
+    NamedDecl *D = nullptr;
     if (const auto *ASE = dyn_cast<ArraySubscriptExpr>(Operation)) {
       Loc = ASE->getBase()->getExprLoc();
       Range = ASE->getBase()->getSourceRange();
@@ -2261,6 +2262,12 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
         // note_unsafe_buffer_operation doesn't have this mode yet.
         assert(!IsRelatedToDecl && "Not implemented yet!");
         MsgParam = 3;
+      } else if (isa<MemberExpr>(Operation)) {
+        // note_unsafe_buffer_operation doesn't have this mode yet.
+        assert(!IsRelatedToDecl && "Not implemented yet!");
+        auto ME = dyn_cast<MemberExpr>(Operation);
+        D = ME->getMemberDecl();
+        MsgParam = 5;
       } else if (const auto *ECE = dyn_cast<ExplicitCastExpr>(Operation)) {
         QualType destType = ECE->getType();
         if (!isa<PointerType>(destType))
@@ -2285,7 +2292,12 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
              "Variables blamed for unsafe buffer usage without suggestions!");
       S.Diag(Loc, diag::note_unsafe_buffer_operation) << MsgParam << Range;
     } else {
-      S.Diag(Loc, diag::warn_unsafe_buffer_operation) << MsgParam << Range;
+      if (D) {
+        S.Diag(Loc, diag::warn_unsafe_buffer_operation)
+            << MsgParam << D << Range;
+      } else {
+        S.Diag(Loc, diag::warn_unsafe_buffer_operation) << MsgParam << Range;
+      }
       if (SuggestSuggestions) {
         S.Diag(Loc, diag::note_safe_buffer_usage_suggestions_disabled);
       }
diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index 33f9c2f51363c6..e0b86faf113b69 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -200,7 +200,7 @@
 // CHECK-NEXT: TestTypestate (SubjectMatchRule_function_is_member)
 // CHECK-NEXT: TrivialABI (SubjectMatchRule_record)
 // CHECK-NEXT: Uninitialized (SubjectMatchRule_variable_is_local)
-// CHECK-NEXT: UnsafeBufferUsage (SubjectMatchRule_function)
+// CHECK-NEXT: UnsafeBufferUsage (SubjectMatchRule_function, SubjectMatchRule_field)
 // CHECK-NEXT: UseHandle (SubjectMatchRule_variable_is_parameter)
 // CHECK-NEXT: VTablePointerAuthentication (SubjectMatchRule_record)
 // CHECK-NEXT: VecReturn (SubjectMatchRule_record)
diff --git a/clang/test/SemaCXX/warn-unsafe-buffer-usage-field-attr.cpp b/clang/test/SemaCXX/warn-unsafe-buffer-usage-field-attr.cpp
new file mode 100644
index 00000000000000..0ba605475925b9
--- /dev/null
+++ b/clang/test/SemaCXX/warn-unsafe-buffer-usage-field-attr.cpp
@@ -0,0 +1,190 @@
+// RUN: %clang_cc1 -std=c++20 -Wunsafe-buffer-usage \
+// RUN:            -fsafe-buffer-usage-suggestions -verify %s
+
+using size_t = __typeof(sizeof(int));
+
+namespace std {
+  class type_info;
+  class bad_cast;
+  class bad_typeid;
+
+  template <typename T> class span {
+
+  private:
+    T *elements;
+    size_t size_;
+
+  public:
+    span(T *, size_t){}
+
+    constexpr T* data() const noexcept {
+      return elements;
+    }
+
+    constexpr size_t size() const noexcept {
+      return size_;
+    }
+
+  };
+}
+
+struct A {
+    [[clang::unsafe_buffer_usage]]
+    int *ptr;
+
+    size_t sz;
+};
+
+struct B {
+   A a;
+ 
+   [[clang::unsafe_buffer_usage]]
+   int buf[];
+};
+
+struct D { 
+  [[clang::unsafe_buffer_usage]]
+  int *ptr, *ptr2;
+
+  [[clang::unsafe_buffer_usage]]
+  int buf[10];
+ 
+  size_t sz;
+  
+};
+
+void foo(int *ptr);
+
+void foo_safe(std::span<int> sp);
+
+int* test_atribute_struct(A a) {
+   int b = *(a.ptr); //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+   a.sz++;
+   // expected-warning at +1{{unsafe pointer arithmetic}}
+   return a.ptr++; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+}
+
+void test_attribute_field_deref_chain(B b) {
+  int *ptr = b.a.ptr;//expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+  foo(b.buf); //expected-warning{{field 'buf' prone to unsafe buffer manipulation}}
+}
+
+void test_writes_from_span(std::span<int> sp) {
+  A a;
+  a.ptr = sp.data(); //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+  a.sz = sp.size();
+
+  a.ptr = nullptr; // expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+}
+
+void test_reads_to_span(A a, A b) {
+  //expected-warning at +1{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
+  std::span<int> sp {a.ptr, a.sz}; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+
+  // expected-warning at +1 3{{field 'ptr' prone to unsafe buffer manipulation}}
+  if(a.ptr != nullptr && a.ptr != b.ptr) {
+    foo_safe(sp);
+  }
+
+}
+
+void test_attribute_multiple_fields (D d) {
+   int *p =d.ptr; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}}
+   p = d.ptr2; //expected-warning{{field 'ptr2' prone to unsafe buffer manipulation}}
+
+   p = d.buf; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}}
+
+   int v = d.buf[0]; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}}
+
+   //expected-warning at +1{{unsafe buffer access}}
+   v = d.buf[5]; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}}
+}
+
+template <typename T>
+struct TemplateArray {
+  [[clang::unsafe_buffer_usage]]
+  T *buf;
+
+  [[clang::unsafe_buffer_usage]]
+  size_t sz;
+};
+
+
+void test_struct_template (TemplateArray<int> t) {
+  int *p = t.buf; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}}
+  size_t s = t.sz; //expected-warning{{field 'sz' prone to unsafe buffer manipulation}}
+}
+
+class R {
+  [[clang::unsafe_buffer_usage]]
+  int *array;
+
+  public:
+   int* getArray() {
+     return array; //expected-warning{{field 'array' prone to unsafe buffer manipulation}}
+   }
+ 
+   void setArray(int *arr) {
+     array = arr; //expected-warning{{field 'array' prone to unsafe buffer manipulation}}
+   }
+};
+
+template<class P>
+class Q {
+  [[clang::unsafe_buffer_usage]]
+  P *array;
+
+  public:
+   P* getArray() {
+     return array; //expected-warning{{field 'array' prone to unsafe buffer manipulation}}
+   }
+
+   void setArray(P *arr) {
+     array = arr; //expected-warning{{field 'array' prone to unsafe buffer manipulation}}
+   }
+};
+
+void test_class_template(Q<R> q) {
+   q.getArray();
+   q.setArray(nullptr);
+}
+
+struct AnonSFields {
+ struct {
+  [[clang::unsafe_buffer_usage]]
+  int a;
+ };
+};
+
+void test_anon_struct_fields(AnonSFields anon) {
+  int val = anon.a; //expected-warning{{field 'a' prone to unsafe buffer manipulation}}
+}
+
+union Union {
+  [[clang::unsafe_buffer_usage]]
+  int *ptr1;
+
+  int ptr2;
+};
+
+struct C {
+  Union ptr;
+};
+
+void test_attribute_union(C c) {
+  int *p = c.ptr.ptr1; //expected-warning{{field 'ptr1' prone to unsafe buffer manipulation}}
+
+  int address = c.ptr.ptr2;
+}
+
+struct AnonFields2 { 
+  [[clang::unsafe_buffer_usage]] 
+  struct { 
+    int a; 
+  }; 
+};
+
+void test_anon_struct(AnonFields2 af) {
+  int val = af.a; // No warning here, as the attribute is not explicitly attached to field 'a'
+  val++;
+}



More information about the cfe-commits mailing list