[PATCH] D18055: ELF: Implement --build-id.

Fri Mar 11 00:53:32 PST 2016

On Fri, Mar 11, 2016 at 12:13 AM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ------------------------------
>
> *From: *"Sean Silva via llvm-commits" <llvm-commits at lists.llvm.org>
> *To: *"Rui Ueyama" <ruiu at google.com>
> *Cc: *reviews+D18055+public+cfa74f4269f8beb5 at reviews.llvm.org,
> "llvm-commits" <llvm-commits at lists.llvm.org>
> *Sent: *Thursday, March 10, 2016 10:59:25 PM
> *Subject: *Re: [PATCH] D18055: ELF: Implement --build-id.
>
>
>
> On Thu, Mar 10, 2016 at 7:56 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> AES is for encryption, so it's not usable for secure hashing, no?
>>
>> On Thu, Mar 10, 2016 at 6:33 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Mar 10, 2016 at 2:51 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> For LLD, it hashed 75,421,320 bytes. For Scylla, it hashed 207,703,010
>>>> bytes. So the throughput of MD5 is 444 MB/s and 480 MB/s, respectively.
>>>>
>>>> BLAKE2 claims that it is about 1.6x faster than MD5, but it is probably
>>>> the lower bound for a secure hash function.
>>>>
>>>
>>> AESNI claims to be able to do 1.3 cycles per byte on an old Nehalem.
>>> That is multiple gigabytes per second.
>>>
>>> https://software.intel.com/sites/default/files/m/d/4/1/d/8/10TB24_Breakthrough_AES_Performance_with_Intel_AES_New_Instructions.final.secure.pdf
>>>
>>> Have you tried using ADT/Hashing.h? That is probably the simplest option.
>>>
>>
>> No, I haven't. But if CRC32 is not okay, then it is not okay as well.
>>
>
> I disagree. CRC32 has dramatically weaker characteristics than
> ADT/Hashing.h. CRC polynomials are constructed to detect specific classes
> of errors that are characteristic of physical communication channels, e.g.
> "anything which toggles a single bit is guaranteed to change the CRC value".
>
> One advantage that CRCs might have here over other schemes is that it is
> possible to efficiently compute them in parallel. I wrote a small library
> that does this for CRC64 values a few years ago (
> http://trac.alcf.anl.gov/projects/hpcrc64), and while OpenMP is used for
> the parallelization, which we don't want to use here, it is pretty simple
> to see how the scheme works.
>

Yeah. Even without OpenMP, the inner loop of a CRC computation typically has
high latency but can be fully pipelined, so it is advantageous to do
multiple computations simultaneously even on a single core. E.g. Intel's
CRC32C instruction has a latency of 3 cycles but is fully pipelined, so to
get max performance you have to be processing 3 blocks at a time and
periodically merge them with CLMUL. With a table lookup implementation it
is even more important because your table lookup latency is load, op, store
to L1D, so at least 7 cycles.
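
A minimal sketch of the three-blocks-at-a-time idea, under a few loudly
stated assumptions: it uses a portable table-driven CRC-32 (polynomial
0xEDB88320) rather than the CRC32C instruction, and it merges the partial
CRCs with a simple linear-time zero-feed rather than CLMUL. All function
names are made up for illustration; this is not code from the patch or from
hpcrc64.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Byte-wise lookup table for the reflected CRC-32 polynomial 0xEDB88320
// (chosen only for illustration; CRC32C uses a different polynomial).
static std::array<uint32_t, 256> makeTable() {
  std::array<uint32_t, 256> T{};
  for (uint32_t I = 0; I < 256; ++I) {
    uint32_t C = I;
    for (int K = 0; K < 8; ++K)
      C = (C & 1) ? (C >> 1) ^ 0xEDB88320u : C >> 1;
    T[I] = C;
  }
  return T;
}
static const std::array<uint32_t, 256> Table = makeTable();

// One table-lookup step on the raw CRC register.
static inline uint32_t step(uint32_t S, uint8_t B) {
  return Table[(S ^ B) & 0xFF] ^ (S >> 8);
}

// Reference single-stream CRC-32 (init and final XOR of 0xFFFFFFFF).
uint32_t crc32Serial(const uint8_t *P, size_t N) {
  uint32_t S = 0xFFFFFFFFu;
  for (size_t I = 0; I < N; ++I)
    S = step(S, P[I]);
  return S ^ 0xFFFFFFFFu;
}

// CRC(A||B) from CRC(A), CRC(B) and |B|: push |B| zero bytes through the
// raw register starting from CRC(A), then XOR in CRC(B). This loop is
// O(|B|); a real implementation would do the merge in constant or
// logarithmic time (e.g. with CLMUL), which is what makes splitting pay off.
uint32_t crc32Combine(uint32_t CrcA, uint32_t CrcB, size_t LenB) {
  for (size_t I = 0; I < LenB; ++I)
    CrcA = step(CrcA, 0);
  return CrcA ^ CrcB;
}

// Split the input into three blocks and advance three independent CRC
// registers in the same loop, so the dependency chains can overlap.
uint32_t crc32Interleaved3(const uint8_t *P, size_t N) {
  size_t L = N / 3;        // length of blocks A and B
  size_t LenC = N - 2 * L; // block C takes the remainder, LenC >= L
  const uint8_t *A = P, *B = P + L, *C = P + 2 * L;
  uint32_t SA = 0xFFFFFFFFu, SB = 0xFFFFFFFFu, SC = 0xFFFFFFFFu;
  for (size_t I = 0; I < L; ++I) {
    SA = step(SA, A[I]);
    SB = step(SB, B[I]);
    SC = step(SC, C[I]);
  }
  for (size_t I = L; I < LenC; ++I) // tail of block C
    SC = step(SC, C[I]);
  uint32_t AB = crc32Combine(SA ^ 0xFFFFFFFFu, SB ^ 0xFFFFFFFFu, L);
  return crc32Combine(AB, SC ^ 0xFFFFFFFFu, LenC);
}
```

For any buffer, crc32Interleaved3 should return the same value as
crc32Serial; the only difference is that the three chains of dependent
lookups are independent of each other within the main loop.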

I see that your implementation ticks all the boxes (multiword
(within-a-register parallel) *and* instruction parallel *and* thread
parallel). Beautiful! I guess my ramblings above should be considered some
extra explanatory comments for your code m(-_-)m

-- Sean Silva

>
>  -Hal
>
> They are not particularly good as hash values.
> ADT/Hashing.h is designed against much more stringent tests (I believe it
> is tested against SMHasher).
>
> The fedora page does not mention any use case that requires
> cryptographically strong hashing.
>
> -- Sean Silva
>
>
>>
>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> On Thu, Mar 10, 2016 at 2:16 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>> silvas added a subscriber: silvas.
>>>>> silvas added a comment.
>>>>>
>>>>> One thing that is missing from the data you provided in the OP is the
>>>>> actual amount of data hashed, and in what pattern (lots of small, or a
>>>>> couple large?). How fast does the hashing have to be to be "acceptable"?
>>>>> E.g. ADT/Hashing.h claims 6.5GB/s hashing for large keys on Nehalem (which
>>>>> is pretty old at this point; modern CPUs are likely to do better). That is
>>>>> 6.5MB/ms. If we need to hash 100MB then the overhead should be about 15ms.
>>>>>
>>>>> For LLD, you quoted a number 713.78ms link time. So for 1% overhead we
>>>>> need to spend <7ms. This is enough for 45MB hashed with ADT/Hashing.h
>>>>> (which is similar to LLD text+data size), assuming that we can get max
>>>>> memory bandwidth (like Rafael said, we can do this while we would be
>>>>> copying data otherwise, so I don't see a reason we can't hit the full
>>>>> performance of ADT/Hashing.h).
>>>>>
>>>>>
>>>>> http://reviews.llvm.org/D18055
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>