[llvm-dev] RFC: Generalize means the sanitizers work with memory

Mon Feb 27 18:59:27 PST 2017

+Hal

IIRC, Hal mentioned that he did something like this for a no-MMU HPC
environment he was working in.

-- Sean Silva

On Thu, Feb 23, 2017 at 10:16 AM, Ivan A. Kosarev via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

> RFC: Generalize means the sanitizers work with memory
>
> Overview
> ========
>
> Currently, LLVM sanitizers such as Asan and Tsan are tied to a specific
> memory model that relies on the presence of hardware support for virtual
> memory. This prevents sanitizers from being used on platforms that lack
> such support but are otherwise capable of running sanitized programs. Our
> research indicates that adding support for such platforms is possible with
> a relatively small number of changes to the sanitizers' source code and
> zero performance and size penalty on currently supported systems. We also
> found that these changes clarify and formalize the functional and
> performance dependencies between sanitizers and system memory, so they can
> be considered an improvement in terms of design and readability regardless
> of the added capabilities. One can think of it as a zero-cost abstraction
> layer.
>
>
> The Approach
> ============
>
> To support platforms that do not have hardware virtual memory managers, we
> need to introduce the concept of physical memory pages that work as the
> storage for data that sanitizers currently read and write by virtual
> addresses. Once physical memory is a separate concept, every time we
> access virtual memory we have to translate the given virtual address to a
> physical one. For example, this check:
>
>    *(u8 *)MEM_TO_SHADOW(allocated) == 0
>
> becomes:
>
>    *MEM_TO_PSHADOW(allocated) == 0
>
> where the MEM_TO_PSHADOW(mem) macro is defined as:
>
>    #define MEM_TO_PSHADOW(mem) VSHADOW_TO_PSHADOW(MEM_TO_VSHADOW(mem))
>    #define MEM_TO_VSHADOW(mem) /* Whatever MEM_TO_SHADOW() currently is. */
>
> The VSHADOW_TO_PSHADOW(vs) macro returns a pointer to a byte within a
> physical page that corresponds to the given virtual address and allocates
> this page if it has not been allocated before. On platforms that leverage
> hardware virtual memory managers, this macro returns the virtual address
> as a physical one:
>
>    #define VSHADOW_TO_PSHADOW(vs) (reinterpret_cast<u8*>((vs)))
>
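
A minimal sketch of what the software-managed variant of these macros might
look like, written as functions rather than macros. Everything here except
the macro names it mirrors is an assumption for illustration: the static
page pool, the hash-map page table, the 4096-byte page size, and the shadow
mapping (the usual Asan x86-64 Linux scale of 3 and offset 0x7fff8000) are
not taken from the RFC or the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

using u8 = std::uint8_t;
using uptr = std::uintptr_t;

// Assumed shadow mapping: 8 application bytes per shadow byte.
constexpr uptr kShadowScale = 3;
constexpr uptr kShadowOffset = 0x7fff8000;

// Hypothetical physical page geometry and pool size.
constexpr uptr kPhysPageSize = 4096;
constexpr uptr kNumPhysPages = 64;

// Statically reserved, size-aligned storage for physical pages. Static
// storage is zero-initialized, so fresh pages read as "addressable".
alignas(4096) static u8 g_pool[kNumPhysPages][kPhysPageSize];
static uptr g_pages_used = 0;

// Hypothetical page table: virtual shadow page index -> physical page.
static std::unordered_map<uptr, u8 *> g_page_table;

// Stands in for MEM_TO_VSHADOW(mem).
inline uptr MemToVShadow(uptr mem) {
  return (mem >> kShadowScale) + kShadowOffset;
}

// Stands in for VSHADOW_TO_PSHADOW(vs): translates a virtual shadow address
// to the byte backing it, allocating the page on first touch.
inline u8 *VShadowToPShadow(uptr vs) {
  u8 *&page = g_page_table[vs / kPhysPageSize];
  if (page == nullptr) {
    if (g_pages_used == kNumPhysPages) std::abort();  // pool exhausted
    page = g_pool[g_pages_used++];
  }
  return page + vs % kPhysPageSize;
}

// Stands in for MEM_TO_PSHADOW(mem).
inline u8 *MemToPShadow(uptr mem) {
  return VShadowToPShadow(MemToVShadow(mem));
}
```

With these definitions the check from the RFC reads
`*MemToPShadow(allocated) == 0`. On an MMU platform VShadowToPShadow() would
collapse to the reinterpret_cast shown above, which is where the zero-cost
claim comes from.
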
> Physical pages are required to be aligned to their size. The size of
> physical pages is a multiple of the shadow memory granularity (8 bytes for
> Asan) and not less than the size of the widest scalar access we have to
> support (16 bytes). This makes finding page offsets trivial, which we need
> in order to implement RTL functions efficiently. It also simplifies
> handling of aligned accesses to physical memory, as they are known not to
> cross physical page boundaries. Note that RTL functions have to be fixed so
> that they do not rely on a specific size, location or order of physical
> pages.
>
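
To illustrate the two properties claimed above, here is a small sketch of
the page arithmetic. It assumes a power-of-two page size (the RFC only
requires a multiple of the shadow granularity that is at least 16 bytes);
kPhysPageSize, PageIndex, PageOffset and FitsInOnePage are hypothetical
names.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;

// Hypothetical page size; a power of two is assumed so that the offset can
// be computed with a mask.
constexpr uptr kPhysPageSize = 4096;

constexpr uptr PageIndex(uptr addr) { return addr / kPhysPageSize; }
constexpr uptr PageOffset(uptr addr) { return addr & (kPhysPageSize - 1); }

// True if an access of `size` bytes starting at `addr` touches one page only.
constexpr bool FitsInOnePage(uptr addr, uptr size) {
  return PageIndex(addr) == PageIndex(addr + size - 1);
}

// A 16-byte access aligned to 16 bytes never straddles a page boundary,
// because page boundaries are themselves multiples of 16...
static_assert(FitsInOnePage(kPhysPageSize - 16, 16), "aligned access");
// ...while an unaligned access of the same width can.
static_assert(!FitsInOnePage(kPhysPageSize - 8, 16), "unaligned access");
```

The same reasoning holds for any page size that is a multiple of the access
width; the mask in PageOffset() is the only part that needs a power of two.
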
> In addition to the facilities that allow handling of individual accesses to
> the virtual memory we also need a set of functions that efficiently perform
> operations on specified ranges of virtual addresses:
>
> // Fills a range of virtual memory with a given value. May release zeroed
> // pages. For DFsan we may need a version of this function that takes
> // 16-bit values to fill with.
> void vshadow_memset(uptr vs, u8 value, uptr size);
>
> // Similarly to vshadow_memset(), this function fills a range of virtual
> // memory with a given value and additionally claims that range as
> // read-only so the memory manager is not required to support modifying
> // accesses for these addresses.
> void fill_rodata_vshadow(uptr vs, u8 value, uptr size);
>
> // Copies potentially overlapping memory regions.
> void vshadow_memmove(uptr dest, uptr src, uptr size);
>
> // Returns the virtual address of the first non-zero byte in a given
> // virtual address range. Can also be used to test for zeroed regions.
> uptr find_non_zero_vshadow_byte(uptr vs, uptr size);
>
> // Explicitly releases pages that fit the specified range.
> void release_vshadow(uptr vs, uptr size);
>
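
As a rough illustration of why range functions can be efficient on a paged
store, here is a sketch of two of them that walks the range one page at a
time and never touches pages that were not allocated. The page table
(g_pages), the 64-byte kPageSize, and the convention of returning vs + size
when the whole range is zero are assumptions; the RFC does not specify them.
The sketch also ignores the page alignment requirement, since it never
derives a page from a pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <memory>

using u8 = std::uint8_t;
using uptr = std::uintptr_t;

constexpr uptr kPageSize = 64;  // hypothetical; small to keep the demo cheap

// Hypothetical page table: page index -> backing bytes. Absent pages read
// as all zeroes.
using Page = std::unique_ptr<u8[]>;
static std::map<uptr, Page> g_pages;

void vshadow_memset(uptr vs, u8 value, uptr size) {
  while (size != 0) {
    uptr index = vs / kPageSize, offset = vs % kPageSize;
    uptr chunk = std::min(size, kPageSize - offset);
    if (value == 0 && chunk == kPageSize) {
      g_pages.erase(index);  // a fully zeroed page can simply be released
    } else {
      auto it = g_pages.find(index);
      if (it == g_pages.end() && value != 0)
        it = g_pages.emplace(index, Page(new u8[kPageSize]())).first;
      if (it != g_pages.end())  // zeroing part of an absent page is a no-op
        std::memset(it->second.get() + offset, value, chunk);
    }
    vs += chunk;
    size -= chunk;
  }
}

// Returns the address of the first non-zero byte in [vs, vs + size), or
// vs + size if there is none (assumed sentinel).
uptr find_non_zero_vshadow_byte(uptr vs, uptr size) {
  const uptr end = vs + size;
  while (vs < end) {
    uptr index = vs / kPageSize, offset = vs % kPageSize;
    uptr chunk = std::min(end - vs, kPageSize - offset);
    auto it = g_pages.find(index);
    if (it != g_pages.end()) {  // absent pages are skipped without a scan
      const u8 *begin = it->second.get() + offset;
      const u8 *hit =
          std::find_if(begin, begin + chunk, [](u8 b) { return b != 0; });
      if (hit != begin + chunk) return vs + static_cast<uptr>(hit - begin);
    }
    vs += chunk;
  }
  return end;
}
```

For example, after `vshadow_memset(100, 0xf1, 40)` a scan of the first 256
bytes stops at address 100, and zeroing the two affected pages in full
releases their storage again.
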
>
> The Proof-of-Concept Patch
> ==========================
>
> To make sure the approach is feasible, we have prepared a patch that fixes
> the Asan and Tsan RTL and instrumentation parts to translate virtual shadow
> memory addresses to physical ones and mmap() shadow memory as we access it.
> This way we simulate a software virtual memory manager that allocates
> physical storage for shadow memory on demand.
>
> We used that to mock the RTL for the sanitizer tests. With this mock in
> place, we pass all Tsan tests and fail 3 of 610 Asan tests:
>
> test/asan/TestCases/Linux/cuda_test.cc
> test/asan/TestCases/Linux/nohugepage_test.cc
> test/asan/TestCases/Linux/swapcontext_annotation.cc
>
> The first two tests rely on a specific memory map after initialization of
> the shadow memory, and the last one takes too long to complete. It would
> probably be acceptable to XFAIL them when run with a software memory
> manager enabled and then consider ways to adapt them as necessary on a
> per-test basis.
>
> * * *
>
> With this paper we propose that the changes that make it possible to use
> sanitizers on platforms that have no MMUs become part of the mainline.
> However, before moving further we would like some feedback from the
> community, so comments are much appreciated.
>
> If the approach is fine, we will prepare a set of patches shortly.
>
> Thank you,
>
> --