[PATCH] InstrProf: Calculate a better function hash

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Apr 9 07:46:53 PDT 2014


ping

On 2014 Apr 2, at 14:35, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:

> ping
> 
> On Mar 26, 2014, at 16:41, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> 
>> On Mar 26, 2014, at 4:02 PM, Chandler Carruth <chandlerc at google.com> wrote:
>> 
>>> 
>>> On Wed, Mar 26, 2014 at 3:53 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>> On Mar 26, 2014, at 3:12 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>> 
>>>> +uint64_t PGOHash::finalize() {
>>>> +  // Use Working as the hash directly if we never used MD5.
>>>> +  if (Count <= NumTypesPerWord)
>>>> +    return Working;
>>>> 
>>>> Doesn't this need to byteswap to LE in the event of a BE host?
>>> 
>>> I don’t think so.
>>> 
>>> All the math on Working is 64-bit integer math.  In IRGen with
>>> -fprofile-instr-generate, it's emitted as an i64.  The various writers
>>> and readers treat it as a 64-bit value (byte swapping as necessary)
>>> until -fprofile-instr-use sees it on the other side.
>>> 
>>> So, I still don't understand how this works: host = LE, target = BE; the IRGen writes an i64 bigendian hash, the profile reader reads an i64 littleendian hash, boom? I'm probably missing a step...
>>> 
>>> 
>>> We do need to byte swap the inputs and outputs to MD5, since its
>>> interface uses uint8_t.
>>> 
>>> Yea, this part makes sense.
>>> 
>>> 
>>> Does this deserve a comment somewhere?
>>> 
>>> Based on my comment above, yea, because I still don't get it. =D
>>> 
>>> 
>>>> And then one comment that has nothing to do with the patch, but is just a general comment going forward:
>>>> 
>>>> +/// \brief Stable hasher for PGO region counters.
>>>> +///
>>>> +/// PGOHash produces a stable hash of a given function's control flow.
>>>> +///
>>>> +/// Changing the output of this hash will invalidate all previously generated
>>>> +/// profiles -- i.e., don't do it.
>>>> 
>>>> Is there some version somewhere that we can bump once a <insert epoch> and change this? I mention this because at some point, we'll want >6 bits, or we'll want to switch from MD5 to MDN+1 or whatever, and we should have *some* recourse for updating it on the scale of O(years).
>>> 
>>> Yeah, this is essential.  The idea is to tie it to the version number
>>> in the indexed binary format, which unfortunately doesn’t exist yet.
>>> 
>>> Cool, feel free to document it with a fixme to actually add the version. =D Clearly, we don't have the format available to put it in, but its good to know that it needs to be stable, but there is *some* escape hatch somewhere.
>> 
>> New patch attached; I think it addresses all of your comments.
>> 
>> <pgo-hash-3.patch>_______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> <pgo-hash-3.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgo-hash-3.patch
Type: application/octet-stream
Size: 33153 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140409/23aa1a1a/attachment.obj>


More information about the cfe-commits mailing list