[llvm-dev] GSoC Proposal : Path Profiling Support
Snehasish Kumar via llvm-dev
llvm-dev at lists.llvm.org
Wed Mar 16 14:03:52 PDT 2016
Hi David,
> Are the data below all collected when only one function is picked for
> instrumentation?
Yes, here is a list of the benchmarks and selected functions.
+-----------------+----------------------------------------------------------------------------------------------+
| benchmark       | selected function                                                                            |
+-----------------+----------------------------------------------------------------------------------------------+
| blks            | _Z19BlkSchlsEqEuroNoDivfffffif                                                               |
| bodytrack       | _ZN17ImageMeasurements11InsideErrorERK17ProjectedCylinderRK11BinaryImageRiS6_                |
| bzip2           | BZ2_compressBlock                                                                            |
| ferret          | image_segment                                                                                |
| fluidanimate    | _Z13ComputeForcesv                                                                           |
| freqmine        | _Z32FPArray_conditional_pattern_baseIhEiP7FP_treeiiT_                                        |
| gcc             | bitmap_operation                                                                             |
| hmmer           | P7Viterbi                                                                                    |
| lbm             | LBM_performStreamCollide                                                                     |
| mcf             | price_out_impl                                                                               |
| mcf2000         | price_out_impl                                                                               |
| namd            | _ZN20ComputeNonbondedUtil26calc_pair_energy_fullelectEP9nonbonded                            |
| povray          | _ZN3povL24All_Sphere_IntersectionsEPNS_13Object_StructEPNS_10Ray_StructEPNS_13istack_structE |
| sjeng           | gen                                                                                          |
| soplex          | _ZN6soplex9CLUFactor16vSolveUrightNoNZEPdS1_Piid                                             |
| sphinx          | vector_gautbl_eval_logs3                                                                     |
| streamcluster   | _Z5pgainlP6PointsdPliP17pthread_barrier_t                                                    |
| swaptions       | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii                                                 |
| h264ref         | dct_luma_16x16                                                                               |
+-----------------+----------------------------------------------------------------------------------------------+
> Do you have data when such manual selection is not done?
At the moment, I do not.
>
> thanks,
>
> David
>
>
>>
>> numpaths = Number of possible paths
>> epp+compile = Time taken to compute the encoding, insert instrumentation,
>> and compile to an executable
>> compile = Time taken to compile to an executable
>> execpaths = Number of paths dynamically executed
>> epp-exec-time = Execution time with instrumentation
>> exec-time = Execution time without instrumentation
>> epp-bin-size = Size of instrumented binary in bytes
>> bin-size = Size of uninstrumented binary in bytes
>> ** size of shared library in bytes = 598042
>>
>>
>>
>> +---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
>> | benchmark     | numpaths | epp+compile | compile   | execpaths | epp-exec-time | exec-time | epp-bin-size | bin-size |
>> +---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
>> | blks          | 2        | 0m1.036s    | 0m1.008s  | 2         | 0m3.643s      | 0m3.205s  | 155931       | 155459   |
>> | bodytrack     | 29       | 0m4.907s    | 0m4.881s  | 5         | 0m14.786s     | 0m1.943s  | 2125256      | 2124224  |
>> | bzip2         | 60       | 0m1.274s    | 0m1.268s  | 3         | 0m9.441s      | 0m9.624s  | 259125       | 258477   |
>> | ferret        | 360921   | 0m26.208s   | 0m26.102s | 40        | 0m10.342s     | 0m6.224s  | 8342571      | 8338588  |
>> | fluidanimate  | 384117   | 0m0.895s    | 0m0.869s  | 88        | 0m56.631s     | 0m1.294s  | 202702       | 197878   |
>> | freqmine      | 45       | 0m1.220s    | 0m1.214s  | 18        | 0m22.150s     | 0m5.515s  | 278615       | 277656   |
>> | gcc           | 6026     | 0m31.941s   | 0m31.327s | 125       | 1m30.139s     | 0m36.601s | 6991413      | 6991245  |
>> | hmmer         | 1882     | 0m3.193s    | 0m3.232s  | 65        | 0m58.911s     | 0m2.474s  | 744510       | 742806   |
>> | mcf           | 230      | 0m0.838s    | 0m0.830s  | 10        | 0m11.097s     | 0m3.074s  | 162680       | 161736   |
>> | mcf2000       | 1155     | 0m0.859s    | 0m0.853s  | 26        | 0m24.169s     | 0m4.625s  | 166092       | 165213   |
>> | povray        | 17       | 0m8.543s    | 0m8.552s  | 4         | 9m24.562s     | 5m39.295s | 2388152      | 2387960  |
>> | sjeng         | 158740   | 0m1.648s    | 0m1.637s  | 280       | 0m20.786s     | 0m5.229s  | 368841       | 368009   |
>> | soplex        | 30       | 0m4.849s    | 0m4.848s  | 24        | 7m28.151s     | 4m10.813s | 1244775      | 1242063  |
>> | sphinx        | 26       | 0m2.212s    | 0m2.198s  | 5         | 1m36.291s     | 0m13.811s | 543534       | 543358   |
>> | streamcluster | 21121728 | 0m0.947s    | 0m0.908s  | 33        | 0m50.212s     | 0m5.986s  | 191981       | 185438   |
>> | swaptions     | 20655    | 0m0.965s    | 0m0.950s  | 13        | 0m0.263s      | 0m0.178s  | 193841       | 184274   |
>> | h264ref       | 24130    | 0m4.278s    | 0m4.272s  | 76        | 3m26.701s     | 3m4.461s  | 816660       | 812396   |
>> | lbm           | 8        | 0m0.824s    | 0m0.815s  | 5         | 6m29.685s     | 1m39.180s | 150871       | 150327   |
>> | namd          | 59598954 | 0m4.124s    | 0m4.139s  | 43        | 18m36.447s    | 6m50.288s | 925863       | 925271   |
>> +---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
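
For readers who want a slowdown factor out of the two timing columns, here is a
minimal Python sketch. It assumes the times are in the shell `time` format
"<minutes>m<seconds>s" as printed in the table. The helper names `seconds` and
`slowdown` are made up for this illustration, and only three rows are copied in.

```python
import re

def seconds(t):
    """Convert a string such as '6m29.685s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unexpected time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

def slowdown(epp_exec_time, exec_time):
    """Ratio of instrumented to uninstrumented execution time."""
    return seconds(epp_exec_time) / seconds(exec_time)

# (epp-exec-time, exec-time) pairs copied from three rows of the table.
rows = {
    "blks": ("0m3.643s", "0m3.205s"),
    "lbm": ("6m29.685s", "1m39.180s"),
    "namd": ("18m36.447s", "6m50.288s"),
}

for name, (epp, base) in rows.items():
    print(f"{name}: {slowdown(epp, base):.2f}x")
```

The same helper applies to the epp+compile and compile columns.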
>>
>>
>>
>> > > Open Issues :
>> > > + Update PathProfileInfo on CFG transformations ?
>>
>> > Could you clarify what this means?
>>
>> Changing the control flow graph of a routine may invalidate collected path
>> profiles. For example, splitting a block with an unconditional branch does
>> not change the profile, but introducing a conditional branch invalidates
>> it. The issue I would like to address is which transformations we should
>> treat as safe, and how we should update the internal path profile data
>> structures, if we allow such updates at all.
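
To make the block-splitting example concrete, here is a minimal Python sketch
that counts acyclic entry-to-exit paths in a small CFG. The diamond-shaped CFG,
the block names, and the `count_paths` helper are all hypothetical and only
serve as an illustration. The thread does not spell out the encoding, so this
counts paths only and does not assign path IDs.

```python
from functools import lru_cache

def count_paths(cfg, entry, exit_block):
    """Count entry->exit paths in an acyclic CFG given as {block: [successors]}."""
    @lru_cache(maxsize=None)
    def num(block):
        if block == exit_block:
            return 1
        return sum(num(succ) for succ in cfg[block])
    return num(entry)

# Hypothetical diamond: A branches to B or C, and both reach D.
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

# B split into B -> B2 by an unconditional branch: the path set is unchanged.
split = {"A": ["B", "C"], "B": ["B2"], "B2": ["D"], "C": ["D"], "D": []}

# A conditional branch introduced in B: a path that was never profiled appears.
cond = {"A": ["B", "C"], "B": ["B2", "D"], "B2": ["D"], "C": ["D"], "D": []}

print(count_paths(diamond, "A", "D"))  # 2
print(count_paths(split, "A", "D"))    # 2
print(count_paths(cond, "A", "D"))     # 3
```

In the split case each old path maps one-to-one onto a new path, so the counts
can be carried over. In the conditional case the old count for the path through
B cannot be divided between the two new paths without more information.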
>>
>> > > + Verify with PGOEdge info ?
>>
>> > Ditto.
>>
>> Verification with PGOEdge info means checking that the edge frequencies
>> derived from path profiles equal those collected via instrprof.
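
A minimal Python sketch of that check: add each path's count to every edge on
the path, then compare the totals with an edge profile. The path and edge
counts below are invented for illustration, and the dictionary standing in for
the instrprof edge counts is an assumption, not the actual instrprof format.

```python
from collections import Counter

def edge_counts_from_paths(path_counts):
    """Add each path's execution count to every edge the path traverses."""
    edges = Counter()
    for path, count in path_counts.items():
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += count
    return edges

def mismatched_edges(derived, measured):
    """Return {edge: (derived, measured)} for edges whose counts disagree."""
    return {
        edge: (derived.get(edge, 0), measured.get(edge, 0))
        for edge in set(derived) | set(measured)
        if derived.get(edge, 0) != measured.get(edge, 0)
    }

# Hypothetical path profile: tuple of blocks on the path -> execution count.
path_profile = {("A", "B", "D"): 70, ("A", "C", "D"): 30}

# Hypothetical edge profile, standing in for counts obtained via instrprof.
edge_profile = {("A", "B"): 70, ("B", "D"): 70, ("A", "C"): 30, ("C", "D"): 30}

derived = edge_counts_from_paths(path_profile)
print(mismatched_edges(derived, edge_profile))  # {} when the profiles agree
```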
>>
>> > > + Handle setjmp, longjmp, early program termination, noreturn calls
>>
>> > How do you handle indirect calls?
>>
>> No special handling of indirect calls is needed: path profiles are
>> intra-procedural, and in the general case control returns to the same
>> basic block after the call. In the cases mentioned above, control may
>> not return.
>>
>>
>> Regards,
>> Snehasish
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev