[all-commits] [llvm/llvm-project] da25f9: [flang] Runtime performance improvements to real f...

Peter Klausler via All-commits all-commits at lists.llvm.org
Fri Nov 12 11:40:18 PST 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: da25f968a90ad4560fc920a6d18fc2a0221d2750
      https://github.com/llvm/llvm-project/commit/da25f968a90ad4560fc920a6d18fc2a0221d2750
  Author: Peter Klausler <pklausler at nvidia.com>
  Date:   2021-11-12 (Fri, 12 Nov 2021)

  Changed paths:
    M flang/include/flang/Decimal/decimal.h
    M flang/include/flang/Runtime/descriptor.h
    M flang/lib/Decimal/big-radix-floating-point.h
    M flang/lib/Decimal/decimal-to-binary.cpp
    M flang/runtime/descriptor.cpp
    M flang/runtime/edit-input.cpp
    M flang/runtime/internal-unit.cpp
    M flang/runtime/internal-unit.h
    M flang/runtime/io-stmt.cpp
    M flang/runtime/io-stmt.h
    M flang/runtime/unit.cpp
    M flang/runtime/unit.h
    M flang/unittests/Runtime/NumericalFormatTest.cpp

  Log Message:
  -----------
  [flang] Runtime performance improvements to real formatted input

Profiling a basic internal real input read benchmark shows some
hot spots in the code that prepares input for decimal-to-binary
conversion, rather than in the conversion itself, which is of course
where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but the code in the Fortran runtime that calls it has
not, and there are some obvious lightweight changes worth making here.

Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.

Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes() -- so that input routines can acquire more than
one input character at a time and amortize overhead.
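
As a rough illustration of the idea, here is a minimal sketch of a
bulk-read call with a single-character accessor layered on top of it.
The BufferedInput class, its Advance() member, and ScanDigits() are
hypothetical names invented for this example; they are not the
runtime's actual classes, and the real signatures in io-stmt.h may
differ.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for an input unit; NOT the flang runtime's class.
class BufferedInput {
public:
  explicit BufferedInput(std::string_view data) : data_{data} {}

  // Points p at the unread bytes and returns how many are available.
  std::size_t GetNextInputBytes(const char *&p) const {
    p = data_.data() + at_;
    return data_.size() - at_;
  }

  // Single-character access, expressed in terms of the bulk call.
  std::optional<char> GetCurrentChar() const {
    const char *p{nullptr};
    if (GetNextInputBytes(p) > 0) {
      return *p;
    }
    return std::nullopt;
  }

  // Consumes n bytes; the caller must ensure n does not exceed what
  // GetNextInputBytes() reported as available.
  void Advance(std::size_t n) { at_ += n; }

private:
  std::string_view data_;
  std::size_t at_{0};
};

// Counts and consumes leading decimal digits using one bulk call
// instead of one call per character.
std::size_t ScanDigits(BufferedInput &in) {
  const char *p{nullptr};
  std::size_t avail{in.GetNextInputBytes(p)};
  std::size_t n{0};
  while (n < avail && p[n] >= '0' && p[n] <= '9') {
    ++n;
  }
  in.Advance(n);
  return n;
}
```

In this sketch the per-call overhead is paid once per run of digits
rather than once per character, which is the kind of amortization the
paragraph above describes.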

These changes reduce the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which is on par with Intel Fortran and much faster than
GNU Fortran.

Differential Revision: https://reviews.llvm.org/D113697
