[PATCH] D64512: [InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563)
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 16 10:05:23 PDT 2019
lebedev.ri added a comment.
To follow up on the inline comment: **right now** (llvm master vs. llvm master with this patchset; rawspeed develop with no patches on top) this fold happens once:
$ /repositories/llvm-test-suite/utils/compare.py -m instcombine.MasksDroped /builddirs/llvm-project/build-llvm-test-suite-{old,new}/results.json
/repositories/llvm-test-suite/utils/compare.py:109: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
d = pd.concat(datasets, axis=0, names=['run'], keys=datasetnames)
Tests: 198
Metric: instcombine.MasksDroped
Program results results0 diff
test-suite :: RawSpeed/RawSpeed.test NaN 1.00 nan%
test-suite...AllocatorAdaptorBenchmark.test NaN NaN nan%
test-suite...lateDecompressorBenchmark.test NaN NaN nan%
test-suite...sRawInterpolatorBenchmark.test NaN NaN nan%
test-suite...eed/io/BitStreamBenchmark.test NaN NaN nan%
test-suite...a/CameraMetaDataBenchmark.test NaN NaN nan%
test-suite...fDecoderFuzzer-ArwDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-Cr2Decoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DcrDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DcsDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DngDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-ErfDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-IiqDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-KdcDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-MefDecoder.test NaN NaN nan%
Geomean difference nan%
results results0 diff
count 0.0 1.0 0.0
mean NaN 1.0 NaN
std NaN NaN NaN
min NaN 1.0 NaN
25% NaN 1.0 NaN
50% NaN 1.0 NaN
75% NaN 1.0 NaN
max NaN 1.0 NaN
/builddirs/llvm-project/build-llvm-test-suite-new$ grep -r "instcombine.MasksDroped"
RawSpeed/build/CMakeFiles/rawspeed.dir/src/librawspeed/decompressors/SamsungV0Decompressor.stats: "instcombine.MasksDroped": 1,
results.json: "instcombine.MasksDroped": 1.0,
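The same lookup can be scripted instead of grepped. This is only a minimal sketch, assuming each `.stats` file is a flat JSON object mapping counter names to numbers (as the matched line above suggests); `sum_stat` is a hypothetical helper, not part of llvm-test-suite:

```python
import json
from pathlib import Path


def sum_stat(build_dir, name):
    """Sum counter `name` over every *.stats file found under build_dir."""
    total = 0
    for path in Path(build_dir).rglob("*.stats"):
        try:
            counters = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # skip unreadable or non-JSON files
        if isinstance(counters, dict):
            value = counters.get(name, 0)
            if isinstance(value, (int, float)):
                total += value
    return total
```

For example, `sum_stat("/builddirs/llvm-project/build-llvm-test-suite-new", "instcombine.MasksDroped")` would walk the same tree as the `grep -r` above.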
I expect this number to improve once all the pieces are in place.
While at it, note that the previous fold (`dropRedundantMaskingOfLeftShiftInput()`, D63993 <https://reviews.llvm.org/D63993>) fires more often:
$ /repositories/llvm-test-suite/utils/compare.py -m instcombine.ShiftsCombined /builddirs/llvm-project/build-llvm-test-suite-{old,new}/results.json
/repositories/llvm-test-suite/utils/compare.py:109: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
d = pd.concat(datasets, axis=0, names=['run'], keys=datasetnames)
Tests: 198
Metric: instcombine.ShiftsCombined
Program results results0 diff
test-suite :: RawSpeed/RawSpeed.test 26.00 26.00 0.0%
test-suite...AllocatorAdaptorBenchmark.test NaN NaN nan%
test-suite...lateDecompressorBenchmark.test NaN NaN nan%
test-suite...sRawInterpolatorBenchmark.test NaN NaN nan%
test-suite...eed/io/BitStreamBenchmark.test NaN NaN nan%
test-suite...a/CameraMetaDataBenchmark.test NaN NaN nan%
test-suite...fDecoderFuzzer-ArwDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-Cr2Decoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DcrDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DcsDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-DngDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-ErfDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-IiqDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-KdcDecoder.test NaN NaN nan%
test-suite...fDecoderFuzzer-MefDecoder.test NaN NaN nan%
Geomean difference nan%
results results0 diff
count 1.0 1.0 1.0
mean 26.0 26.0 0.0
std NaN NaN NaN
min 26.0 26.0 0.0
25% 26.0 26.0 0.0
50% 26.0 26.0 0.0
75% 26.0 26.0 0.0
max 26.0 26.0 0.0
The performance implications of **this** patchset are as I expected them to be:
raw.pixls.us-unique/Samsung/NX30$ //usr/src/googlebenchmark/tools/compare.py -a benchmarks /builddirs/llvm-project/build-llvm-test-suite-{old,new}/RawSpeed/build/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_repetitions=128 --benchmark_min_time=0.00000001 2015-03-07-163604_sam_7204.srw
RUNNING: /builddirs/llvm-project/build-llvm-test-suite-old/RawSpeed/build/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_repetitions=128 --benchmark_min_time=0.00000001 2015-03-07-163604_sam_7204.srw --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmp3aOcOJ
2019-07-16 19:55:51
Running /builddirs/llvm-project/build-llvm-test-suite-old/RawSpeed/build/src/utilities/rsbench/rsbench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1)
Load Average: 0.88, 0.76, 1.53
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations CPUTime,s CPUTime/WallTime Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_mean 136 ms 136 ms 128 0.135973 0.99989 20.5978M 151.485M 151.468M 7.35441 7.3536 0.135988
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_median 136 ms 136 ms 128 0.135954 1 20.5978M 151.507M 151.507M 7.35546 7.35547 0.135953
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_stddev 0.212 ms 0.193 ms 128 193.542u 237.857u 0 215.294k 236.262k 0.0104522 0.0114703 212.466u
RUNNING: /builddirs/llvm-project/build-llvm-test-suite-new/RawSpeed/build/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_repetitions=128 --benchmark_min_time=0.00000001 2015-03-07-163604_sam_7204.srw --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmpWIQdvn
2019-07-16 19:56:10
Running /builddirs/llvm-project/build-llvm-test-suite-new/RawSpeed/build/src/utilities/rsbench/rsbench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1)
Load Average: 0.92, 0.78, 1.52
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations CPUTime,s CPUTime/WallTime Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_mean 131 ms 131 ms 128 0.131231 0.999992 20.5978M 156.959M 156.958M 7.62015 7.6201 0.131232
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_median 131 ms 131 ms 128 0.131175 1 20.5978M 157.026M 157.024M 7.62343 7.6233 0.131177
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_stddev 0.218 ms 0.218 ms 128 217.942u 33.6202u 0 259.861k 259.966k 0.0126159 0.012621 218.033u
Comparing /builddirs/llvm-project/build-llvm-test-suite-old/RawSpeed/build/src/utilities/rsbench/rsbench to /builddirs/llvm-project/build-llvm-test-suite-new/RawSpeed/build/src/utilities/rsbench/rsbench
Benchmark Time CPU Time Old Time New CPU Old CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 128 vs 128
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_mean -0.0350 -0.0349 136 131 136 131
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_median -0.0351 -0.0352 136 131 136 131
2015-03-07-163604_sam_7204.srw/threads:1/process_time/real_time_stddev +0.0260 +0.1253 0 0 0 0
A roughly 3.5% reduction in run time (136 ms down to 131 ms) is notable.
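As a sanity check, the relative change can be recomputed by hand from the mean `WallTime,s` and `CPUTime,s` counters in the two runs above. This is a small sketch, assuming the `Time` and `CPU` columns of the comparison are the relative difference `(new - old) / old`; `relative_change` is a name made up for illustration:

```python
def relative_change(old, new):
    """Return (new - old) / old; a negative value means `new` is faster."""
    return (new - old) / old


# Mean values copied from the two rsbench runs above.
wall_old, wall_new = 0.135988, 0.131232  # WallTime,s
cpu_old, cpu_new = 0.135973, 0.131231    # CPUTime,s

print(f"Time: {relative_change(wall_old, wall_new):+.4f}")
print(f"CPU:  {relative_change(cpu_old, cpu_new):+.4f}")
```

Rounded to four decimal places, both values agree with the `_mean` row of the comparison table (-0.0350 and -0.0349).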
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D64512/new/
https://reviews.llvm.org/D64512