[RFC] Using large pages for large hash tables

Mon Oct 17 10:23:10 PDT 2016

----- Original Message -----
> From: "Rafael Espíndola via llvm-commits" <llvm-commits at lists.llvm.org>
> To: "llvm-commits" <llvm-commits at lists.llvm.org>
> Cc: "Aaron Ballman" <aaron.ballman at gmail.com>
> Sent: Monday, October 17, 2016 10:38:02 AM
> Subject: [RFC] Using large pages for large hash tables
> 
> I did a quick and dirty experiment to use large pages when a hash
> table gets big. The results for lld are pretty impressive (see
> attached file, but basically 1.04X faster link of files with debug
> info).

+    R = madvise(Buckets, Alloc, MADV_HUGEPAGE);
+    assert(R == 0);

I suspect you did this only because you were experimenting, but just in case: do we really care whether this call succeeds? madvise is only advice to the kernel, so I don't think we want to treat a failure as an error here.

I wonder if there are other places in LLVM/Clang/etc. where we should do this as well.

 -Hal

> 
> I tested disabling madvise and the performance goes back to what it
> was, so it is really the large pages that improve the performance.
> 
> The main question is then what the interface should look like. On
> Linux the abstraction could be
> 
> std::pair<void *, size_t> mallocLarge(size_t Size);
> 
> which returns the allocated memory and how much was actually
> allocated. The pointer can be passed to free once it is no longer
> needed.
> 
> The fallback implementation just calls malloc and returns Size
> unmodified.
> 
> On Linux x86_64, if Size is larger than 2MiB we use posix_memalign and
> madvise.
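
For reference, the Linux path and the fallback described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: the constant HugePageSize, the rounding of Size up to a whole number of 2MiB pages, and the choice to fall back to malloc when posix_memalign fails are all assumptions made for the sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>
#if defined(__linux__)
#include <sys/mman.h>
#endif

// Assumed huge page size for Linux x86_64, taken from the 2MiB threshold in
// the proposal. The name is hypothetical.
constexpr std::size_t HugePageSize = 2u * 1024 * 1024;

// Sketch of the proposed interface: returns the allocated pointer and the
// number of bytes actually allocated. The pointer is released with free().
std::pair<void *, std::size_t> mallocLarge(std::size_t Size) {
#if defined(__linux__) && defined(__x86_64__) && defined(MADV_HUGEPAGE)
  if (Size > HugePageSize) {
    // Assumption: round up to whole huge pages so the caller can use the
    // slack. Overflow for sizes near SIZE_MAX is not handled in this sketch.
    std::size_t Rounded =
        (Size + HugePageSize - 1) / HugePageSize * HugePageSize;
    void *Ptr = nullptr;
    if (posix_memalign(&Ptr, HugePageSize, Rounded) == 0) {
      // The advice is only a hint, so its result is deliberately ignored.
      (void)madvise(Ptr, Rounded, MADV_HUGEPAGE);
      return {Ptr, Rounded};
    }
    // Aligned allocation failed; fall through to the plain malloc path.
  }
#endif
  // Fallback: plain malloc, with Size returned unmodified.
  return {std::malloc(Size), Size};
}
```

A hash table would call this when it grows, size its bucket array from the second member of the returned pair, and hand the first member to free when it is done.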
> 
> Would the same interface work on Windows?
> 
> Cheers,
> Rafael
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory