[libc-commits] [libc] [libc][gpu] Disable loop unrolling in the throughput benchmark loop by default (PR #153971)
Leandro Lacerda via libc-commits
libc-commits at lists.llvm.org
Sat Aug 16 11:46:52 PDT 2025
leandrolcampos wrote:
Here's what I get on my *NVIDIA GeForce RTX 4070 Laptop GPU*.
```text
[1/4] Running hermetic test libc.benchmarks.gpu.src.ctype.isalnum_benchmark
Running Suite: LlvmLibcIsAlNumGpuBenchmark
Benchmark | Cycles (Mean) | Stddev | Min | Max | Iterations | Threads |
------------------------------------------------------------------------------------------------------
IsAlnum | 53 | 0 | 53 | 53 | 11904 | 64 |
IsAlnumSingleThread | 53 | 0 | 53 | 53 | 186 | 1 |
IsAlnumSingleWave | 53 | 0 | 53 | 53 | 5952 | 32 |
IsAlnumCapital | 53 | 0 | 53 | 53 | 11904 | 64 |
IsAlnumNotAlnum | 43 | 0 | 43 | 43 | 11904 | 64 |
[2/4] Running hermetic test libc.benchmarks.gpu.src.ctype.isalpha_benchmark
Running Suite: LlvmLibcIsAlphaGpuBenchmark
Benchmark | Cycles (Mean) | Stddev | Min | Max | Iterations | Threads |
------------------------------------------------------------------------------------------------------
IsAlpha | 53 | 0 | 53 | 53 | 186 | 1 |
[3/4] Running hermetic test libc.benchmarks.gpu.src.math.sin_benchmark
Running Suite: LlvmLibcSinGpuBenchmark
Benchmark | Cycles (Mean) | Stddev | Min | Max | Iterations | Threads |
------------------------------------------------------------------------------------------------------
Sin_1 | 3122 | 153 | 2933 | 3607 | 2735008 | 32 |
Sin_128 | 2696 | 15 | 2651 | 2739 | 17024 | 32 |
Sin_1024 | 2881 | 5 | 2872 | 2890 | 1344 | 32 |
Sin_4096 | 2895 | 2 | 2891 | 2899 | 352 | 32 |
SinTwoPi_1 | 2219 | 12 | 2204 | 2517 | 24032 | 32 |
SinTwoPi_128 | 2047 | 2 | 2044 | 2051 | 1344 | 32 |
SinTwoPi_1024 | 2253 | 0 | 2253 | 2254 | 576 | 32 |
SinTwoPi_4096 | 2272 | 0 | 2272 | 2272 | 352 | 32 |
SinTwoPow30_1 | 3135 | 17 | 3111 | 3364 | 8480 | 32 |
SinTwoPow30_128 | 2734 | 1 | 2732 | 2736 | 352 | 32 |
SinTwoPow30_1024 | 2940 | 0 | 2940 | 2941 | 352 | 32 |
SinTwoPow30_4096 | 2958 | 0 | 2958 | 2959 | 352 | 32 |
SinVeryLarge_1 | 2858 | 16 | 2823 | 3093 | 8480 | 32 |
SinVeryLarge_128 | 2402 | 2 | 2398 | 2406 | 352 | 32 |
SinVeryLarge_1024 | 2599 | 0 | 2599 | 2600 | 352 | 32 |
SinVeryLarge_4096 | 2615 | 0 | 2615 | 2615 | 352 | 32 |
NvSin_1 | 2522 | 69 | 2261 | 2880 | 5952 | 32 |
NvSin_128 | 1826 | 2 | 1824 | 1830 | 576 | 32 |
NvSin_1024 | 2035 | 0 | 2035 | 2036 | 352 | 32 |
NvSin_4096 | 2053 | 0 | 2053 | 2053 | 352 | 32 |
NvSinTwoPi_1 | 1107 | 1 | 1104 | 1108 | 2880 | 32 |
NvSinTwoPi_128 | 891 | 0 | 891 | 891 | 352 | 32 |
NvSinTwoPi_1024 | 1102 | 0 | 1101 | 1102 | 352 | 32 |
NvSinTwoPi_4096 | 1122 | 0 | 1122 | 1122 | 352 | 32 |
NvSinTwoPow30_1 | 1106 | 1 | 1105 | 1108 | 1344 | 32 |
NvSinTwoPow30_128 | 891 | 0 | 891 | 891 | 352 | 32 |
NvSinTwoPow30_1024 | 1101 | 0 | 1101 | 1101 | 352 | 32 |
NvSinTwoPow30_4096 | 1122 | 0 | 1122 | 1122 | 352 | 32 |
NvSinVeryLarge_1 | 2497 | 23 | 2251 | 2845 | 12032 | 32 |
NvSinVeryLarge_128 | 1790 | 1 | 1789 | 1792 | 576 | 32 |
NvSinVeryLarge_1024 | 1999 | 0 | 1999 | 1999 | 352 | 32 |
NvSinVeryLarge_4096 | 2019 | 0 | 2019 | 2019 | 352 | 32 |
Sinf_1 | 2201 | 170 | 1522 | 2400 | 507776 | 32 |
Sinf_128 | 1872 | 13 | 1830 | 1898 | 2880 | 32 |
Sinf_1024 | 2056 | 5 | 2047 | 2068 | 1984 | 32 |
Sinf_4096 | 2093 | 3 | 2088 | 2098 | 352 | 32 |
SinfTwoPi_1 | 1442 | 11 | 1426 | 1759 | 33856 | 32 |
SinfTwoPi_128 | 1126 | 1 | 1125 | 1129 | 352 | 32 |
SinfTwoPi_1024 | 1314 | 0 | 1314 | 1315 | 352 | 32 |
SinfTwoPi_4096 | 1350 | 0 | 1350 | 1350 | 352 | 32 |
SinfTwoPow30_1 | 1088 | 10 | 1080 | 1162 | 1984 | 32 |
SinfTwoPow30_128 | 771 | 1 | 771 | 774 | 1984 | 32 |
SinfTwoPow30_1024 | 961 | 0 | 960 | 962 | 352 | 32 |
SinfTwoPow30_4096 | 997 | 0 | 997 | 997 | 352 | 32 |
SinfVeryLarge_1 | 1925 | 14 | 1869 | 2282 | 24032 | 32 |
SinfVeryLarge_128 | 1598 | 1 | 1598 | 1600 | 352 | 32 |
SinfVeryLarge_1024 | 1788 | 0 | 1787 | 1789 | 352 | 32 |
SinfVeryLarge_4096 | 1824 | 0 | 1824 | 1824 | 352 | 32 |
NvSinf_1 | 1024 | 6 | 1019 | 1043 | 1984 | 32 |
NvSinf_128 | 742 | 0 | 742 | 744 | 576 | 32 |
NvSinf_1024 | 932 | 0 | 932 | 933 | 352 | 32 |
NvSinf_4096 | 967 | 0 | 967 | 967 | 352 | 32 |
NvSinfTwoPi_1 | 162 | 3 | 162 | 497 | 362464 | 32 |
NvSinfTwoPi_128 | 107 | 0 | 107 | 109 | 2880 | 32 |
NvSinfTwoPi_1024 | 297 | 0 | 297 | 297 | 352 | 32 |
NvSinfTwoPi_4096 | 334 | 0 | 334 | 334 | 352 | 32 |
NvSinfTwoPow30_1 | 1026 | 11 | 1018 | 1281 | 33856 | 32 |
NvSinfTwoPow30_128 | 742 | 0 | 741 | 742 | 896 | 32 |
NvSinfTwoPow30_1024 | 931 | 0 | 931 | 931 | 352 | 32 |
NvSinfTwoPow30_4096 | 967 | 0 | 967 | 967 | 352 | 32 |
NvSinfVeryLarge_1 | 1003 | 1 | 1000 | 1004 | 1984 | 32 |
NvSinfVeryLarge_128 | 723 | 0 | 723 | 723 | 352 | 32 |
NvSinfVeryLarge_1024 | 913 | 0 | 913 | 913 | 352 | 32 |
NvSinfVeryLarge_4096 | 949 | 0 | 949 | 949 | 352 | 32 |
[4/4] Running hermetic test libc.benchmarks.gpu.src.math.atan2_benchmark
Running Suite: LlvmLibcAtan2GpuBenchmark
Benchmark | Cycles (Mean) | Stddev | Min | Max | Iterations | Threads |
------------------------------------------------------------------------------------------------------
Atan2_1 | 4082 | 954 | 1892 | 5271 | 24032 | 32 |
Atan2_128 | 3852 | 80 | 3531 | 4112 | 131648 | 32 |
Atan2_1024 | 4083 | 31 | 3991 | 4150 | 2880 | 32 |
Atan2_4096 | 4080 | 16 | 4058 | 4111 | 576 | 32 |
Atan2TwoPi_1 | 2738 | 16 | 2728 | 3162 | 24032 | 32 |
Atan2TwoPi_128 | 2511 | 2 | 2508 | 2515 | 352 | 32 |
Atan2TwoPi_1024 | 2743 | 0 | 2742 | 2743 | 352 | 32 |
Atan2TwoPi_4096 | 2744 | 0 | 2744 | 2745 | 352 | 32 |
Atan2TwoPow30_1 | 2734 | 15 | 2721 | 3148 | 24032 | 32 |
Atan2TwoPow30_128 | 2517 | 2 | 2512 | 2525 | 1344 | 32 |
Atan2TwoPow30_1024 | 2743 | 0 | 2743 | 2744 | 352 | 32 |
Atan2TwoPow30_4096 | 2744 | 0 | 2744 | 2744 | 352 | 32 |
Atan2Large_1 | 3570 | 382 | 1125 | 3882 | 131648 | 32 |
Atan2Large_128 | 3352 | 37 | 3280 | 3421 | 1984 | 32 |
Atan2Large_1024 | 3578 | 10 | 3554 | 3601 | 1984 | 32 |
Atan2Large_4096 | 3576 | 6 | 3566 | 3586 | 576 | 32 |
NvAtan2_1 | 2909 | 38 | 2866 | 3339 | 17024 | 32 |
NvAtan2_128 | 2801 | 2 | 2798 | 2805 | 352 | 32 |
NvAtan2_1024 | 3040 | 1 | 3039 | 3041 | 352 | 32 |
NvAtan2_4096 | 3041 | 1 | 3040 | 3042 | 352 | 32 |
NvAtan2TwoPi_1 | 2032 | 13 | 2032 | 2386 | 24032 | 32 |
NvAtan2TwoPi_128 | 1945 | 1 | 1945 | 1947 | 352 | 32 |
NvAtan2TwoPi_1024 | 2185 | 0 | 2184 | 2185 | 352 | 32 |
NvAtan2TwoPi_4096 | 2185 | 0 | 2185 | 2186 | 352 | 32 |
NvAtan2TwoPow30_1 | 2032 | 8 | 2032 | 2184 | 12032 | 32 |
NvAtan2TwoPow30_128 | 1945 | 1 | 1945 | 1951 | 896 | 32 |
NvAtan2TwoPow30_1024 | 2184 | 0 | 2184 | 2184 | 352 | 32 |
NvAtan2TwoPow30_4096 | 2185 | 0 | 2185 | 2186 | 352 | 32 |
NvAtan2Large_1 | 2032 | 12 | 2032 | 2359 | 24032 | 32 |
NvAtan2Large_128 | 1945 | 1 | 1945 | 1951 | 896 | 32 |
NvAtan2Large_1024 | 2184 | 0 | 2184 | 2185 | 352 | 32 |
NvAtan2Large_4096 | 2185 | 0 | 2185 | 2185 | 352 | 32 |
```
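
To compare the numbers above pairwise, a minimal Python sketch could parse the pipe-delimited rows and compute the ratio of each mean cycle count to that of its `Nv`-prefixed counterpart. It assumes that an `Nv`-prefixed row is the counterpart of the unprefixed row with the same name. `parse_rows` and `ratio_to_counterpart` are hypothetical helper names and are not part of the benchmark suite.

```python
def parse_rows(text):
    """Parse pipe-delimited benchmark rows into {name: mean_cycles}."""
    means = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Keep only data rows: a name followed by numeric columns.
        # Header rows, separator rules, and status lines are skipped.
        if len(cells) < 7 or not cells[1].isdigit():
            continue
        means[cells[0]] = int(cells[1])
    return means


def ratio_to_counterpart(means, prefix="Nv"):
    """Return {name: mean / counterpart_mean} for rows that have a prefixed counterpart.

    Assumption: "<prefix><name>" is the counterpart of "<name>".
    """
    return {
        name: cycles / means[prefix + name]
        for name, cycles in means.items()
        if not name.startswith(prefix) and prefix + name in means
    }


# A few rows copied from the output above.
sample = """\
Benchmark | Cycles (Mean) | Stddev | Min | Max | Iterations | Threads |
Sin_128 | 2696 | 15 | 2651 | 2739 | 17024 | 32 |
NvSin_128 | 1826 | 2 | 1824 | 1830 | 576 | 32 |
Sinf_128 | 1872 | 13 | 1830 | 1898 | 2880 | 32 |
NvSinf_128 | 742 | 0 | 742 | 744 | 576 | 32 |
"""

means = parse_rows(sample)
ratios = ratio_to_counterpart(means)
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}x the mean cycles of Nv{name}")
```

Feeding the full output to `parse_rows` instead of `sample` should work the same way, since lines that are not data rows are skipped.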
https://github.com/llvm/llvm-project/pull/153971