<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    On 23.10.18 01:42, Artem Belevich wrote:<br>

    <blockquote type="cite"

cite="mid:CA+wKYkOAF16OV+3SMAntta=N1rvAxMy83L526m3_W3Yj5DwMZw@mail.gmail.com">


      <div dir="ltr">

        <div dir="ltr">

          <div class="gmail_default"

            style="font-family:verdana,sans-serif"><br>

          </div>

          <br>

          <div class="gmail_quote">

            <div dir="ltr">On Mon, Oct 22, 2018 at 5:45 AM Lorenz Braun

              via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org"

                moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;

              wrote:<br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">Hi,<br>

              <br>

              I just found out that I can also use llc to produce PTX

              assembly for <br>

              GPUs. I noticed that the produced PTX assembly seems to be

              targeted at <br>

              the GPU architecture sm_20 by default.<br>

            </blockquote>

            <div><br>

            </div>

            <div>

              <div class="gmail_default"

                style="font-family:verdana,sans-serif">This is currently

                the default CPU type for the NVPTX back-end.</div>

            </div>

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">

              <br>

              Is there a way to explicitly request different or

              additional target <br>

              architectures, such as sm_30?<br>

            </blockquote>

            <div><br>

            </div>

            <div>

              <div class="gmail_default"

                style="font-family:verdana,sans-serif">It works the same

                way as for the other back-ends. You specify the CPU

                variant with -mcpu=. E.g. for sm_30 you should use <span

                  style="font-family:Arial,Helvetica,sans-serif">-mcpu=sm_30</span></div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    -mcpu=sm_30 was just what I was looking for. Thank you very much!
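    <br>
    For the archive, a minimal sketch of that invocation as a small
    Python helper. It assumes llc is on the PATH and that the bitcode
    already carries an NVPTX target triple; the file names kernel.bc and
    kernel.ptx and the helper name llc_ptx_command are placeholders of
    mine, not something from this thread.<br>
<pre>
```python
import shlex


def llc_ptx_command(bitcode_path, cpu="sm_30", output_path="kernel.ptx"):
    """Build an llc command line that selects the GPU variant via -mcpu."""
    # kernel.ptx is a placeholder output name, not taken from the thread.
    return ["llc", f"-mcpu={cpu}", bitcode_path, "-o", output_path]


# Print the command instead of running it, so the sketch works without llc.
print(shlex.join(llc_ptx_command("kernel.bc")))
# prints: llc -mcpu=sm_30 kernel.bc -o kernel.ptx
```
</pre>
    The returned list can be handed to subprocess.run as-is if the
    command should actually be executed.<br>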

    <blockquote type="cite"

cite="mid:CA+wKYkOAF16OV+3SMAntta=N1rvAxMy83L526m3_W3Yj5DwMZw@mail.gmail.com">

      <div dir="ltr">

        <div dir="ltr">

          <div class="gmail_quote">

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">

              <br>

              When I compile a CUDA kernel for GPU architecture sm_30 using

              clang++, the <br>

              .target directive in the PTX assembly is set to

              sm_30. However, when <br>

              I save the bitcode of the same compilation and hand it to

              llc, the <br>

              .target directive is sm_20. There is an attribute in the

              bitcode that <br>

              says "target-cpu"="sm_30". The information that sm_30 is

              required is <br>

              still there. </blockquote>

            <div><br>

            </div>

            <div>

              <div class="gmail_default"

                style="font-family:verdana,sans-serif">It's a *function*

                attribute which, generally speaking, can't be used as

                the default for the whole module. It also does not do

                much in the NVPTX back-end. Eventually it will be used to

                enforce that -mcpu=XXX is the same or higher than

                all target-cpu attributes in a module. This is one of

                the areas where NVPTX can't implement what the attribute

                was intended to do -- target different CPU variants

                within the same module. It's doable on x86 where the

                same ISA can represent instructions for different CPU

                variants, but can't be done in PTX which requires

                everything in the module to be for the same GPU.</div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    Thanks for the info. I understand: functions with different

    target-cpu values can only be handled by putting them in separate

    modules. But I think use cases for this are rather rare.<br>
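    <br>
    To illustrate the kind of check described above, here is a minimal
    sketch that scans textual IR (e.g. the output of llvm-dis) for
    "target-cpu" function attributes and reports the highest sm_XX,
    which could then be passed to llc as -mcpu. This is my own
    illustration, not something the NVPTX back-end does today. The
    helper name highest_target_cpu and the sample IR are made up, it
    only understands plain numeric names such as sm_30, and it assumes
    that a larger number means a newer GPU.<br>
<pre>
```python
import re

# Matches function attributes such as "target-cpu"="sm_30" in textual LLVM IR.
# Assumption: only plain numeric variants (sm_20, sm_30, ...) are handled.
TARGET_CPU = re.compile(r'"target-cpu"="sm_(\d+)"')


def highest_target_cpu(ir_text):
    """Return the highest sm_XX named by any target-cpu attribute, or None."""
    versions = [int(v) for v in TARGET_CPU.findall(ir_text)]
    if not versions:
        return None
    return "sm_" + str(max(versions))


# Hypothetical attribute groups, as they might appear in a .ll file.
example_ir = '''
attributes #0 = { "target-cpu"="sm_30" }
attributes #1 = { "target-cpu"="sm_35" }
'''

print(highest_target_cpu(example_ir))
# prints: sm_35
```
</pre>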

    <br>

    Best regards<br>

    Lorenz<br>

  </body>

</html>