<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><font size="-1">Not exactly.</font></p>

    <p><font size="-1">A modern CPU frequently has three cache levels,
        each successive level larger but slower to access than the one
        before it. Beyond the caches is main memory, and beyond memory
        is disk, which has its own cache for frequently used disk data.
        In addition to where the data sits in this hierarchy, it matters
        how the data is fetched, which is in blocks of a certain size.</font></p>

    <p><font size="-1">This link gives a rough idea of how long it takes

        to access various data sources.</font></p>

    <p><font size="-1"><a class="moz-txt-link-freetext" href="http://norvig.com/21-days.html#answers">http://norvig.com/21-days.html#answers</a></font></p>

    <p><font size="-1">Beyond this is how the container is used. For
        example, an unordered_map whose values are pointers to another
        location where the data is actually stored adds one more
        indirection, and so one more source of delay.</font></p>

    <p><font size="-1">Knowing whether a particular container selection
        and design actually works better will likely require
        benchmarking the alternatives under realistic conditions.</font></p>
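    <p><font size="-1">As a rough illustration of such a benchmark, here
        is a minimal sketch that times a sequential scan against an
        unordered_map lookup for a record of a given size. Everything in
        it is hypothetical: FieldList, FieldMap, make_fields,
        scan_lookup, map_lookup, ns_per_lookup and compare_lookups are
        names made up for this example, the integer values stand in for
        real field data, and none of it reflects how TableGen actually
        stores records. The loop also keeps everything hot in cache, so
        it is not the realistic workload a final decision would need.</font></p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record layouts: a flat list of (name, value) pairs
// versus a hash map keyed by field name.
using FieldList = std::vector<std::pair<std::string, int>>;
using FieldMap = std::unordered_map<std::string, int>;

// Build a record with n fields named "field0", "field1", ...
FieldList make_fields(std::size_t n) {
    FieldList fields;
    fields.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        fields.emplace_back("field" + std::to_string(i), static_cast<int>(i));
    return fields;
}

// Sequential scan over the field list; returns -1 if the name is absent.
int scan_lookup(const FieldList& fields, const std::string& name) {
    for (const auto& f : fields)
        if (f.first == name) return f.second;
    return -1;
}

// Hash lookup; returns -1 if the name is absent.
int map_lookup(const FieldMap& map, const std::string& name) {
    auto it = map.find(name);
    return it == map.end() ? -1 : it->second;
}

// Look up every field name `rounds` times and return the average
// nanoseconds per lookup. The checksum accumulates the results so the
// compiler cannot discard the loop as dead code.
template <typename Lookup>
double ns_per_lookup(const FieldList& fields, std::size_t rounds,
                     Lookup lookup, long long& checksum) {
    assert(rounds > 0 && !fields.empty());
    auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < rounds; ++r)
        for (const auto& f : fields)
            checksum += lookup(f.first);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(rounds * fields.size());
}

struct Timings {
    double scan_ns;      // average ns per sequential-scan lookup
    double map_ns;       // average ns per unordered_map lookup
    long long checksum;  // sum of all looked-up values
};

// Time both strategies on the same synthetic record.
Timings compare_lookups(std::size_t n_fields, std::size_t rounds) {
    FieldList fields = make_fields(n_fields);
    FieldMap map(fields.begin(), fields.end());
    Timings t{0.0, 0.0, 0};
    t.scan_ns = ns_per_lookup(
        fields, rounds,
        [&](const std::string& name) { return scan_lookup(fields, name); },
        t.checksum);
    t.map_ns = ns_per_lookup(
        fields, rounds,
        [&](const std::string& name) { return map_lookup(map, name); },
        t.checksum);
    return t;
}
```

    <p><font size="-1">Calling compare_lookups(52, 100000) from a main
        function and printing scan_ns and map_ns would give a first
        impression for a record of about the average size in the
        statistics quoted below. The numbers will vary with compiler,
        optimization flags and hardware, which is the reason to measure
        rather than guess.</font></p>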

    <p><font size="-1">Along this line, I had thought that NVMe SSDs were
        substantially faster than SATA SSDs, and then saw some careful
        benchmarking showing that NVMe SSDs were only marginally
        faster, which explained the marginal difference in price.<br>

      </font></p>

    <p><font size="-1">Neil Nelson<br>

      </font></p>

    <div class="moz-cite-prefix"><font size="-1">On 12/1/20 4:45 PM,

        Paul C. Anagnostopoulos via llvm-dev wrote:</font><br>

    </div>

    <blockquote type="cite"

      cite="mid:202012012349.0B1NnsCM019983@mail283c2.megamailservers.com">

      <pre class="moz-quote-pre" wrap="">Here are some statistics for those of you who like statistics.

Before embarking on a project to speed up access to record fields in TableGen, I thought I'd collect a little data.

Number of records built from the AMDGPU .td files: 55,539

Number of fields in those records: 2,877,918

Average number of fields per record: 52

So the average field lookup sequential scan is about 25 iterations

Number of field lookups performed by DAG ISel emitter: 6,294,426

Number of iterations = 6,294,426 x 25 = 157,000,000 (approx.)

How long does each iteration take? Can it be more than 10 instructions? It's 7 instructions on the X86. So perhaps about 3 ns.? (I may be off here.)

Time saved if the field access could be cut to 0 ns.: 472,000,000 ns.

Next project!

</pre>

    </blockquote>

  </body>

</html>