From libclc-dev at lists.llvm.org Tue Oct 16 16:35:32 2018
From: libclc-dev at lists.llvm.org (Chandler Carruth via Libclc-dev)
Date: Tue, 16 Oct 2018 16:35:32 -0700
Subject: [Libclc-dev] LLVM Relicensing Update
Message-ID:

Greetings,

I wanted to provide an update to all developers of the LLVM project (including all of its sub-projects) about the ongoing effort to relicense LLVM under a new, unified license.

TL;DR: It’s actually happening. If you are a contributor to LLVM, help us out by filling out our form and signing an agreement to cover any individual contributions you have made: https://goo.gl/forms/X4HiyYRcRHOnTSvC3

All of this information and the latest status can always be found on the relicensing website here: http://llvm.org/foundation/relicensing/

## Background and Process

For background, here is the new license: http://llvm.org/foundation/relicensing/LICENSE.txt

For the motivation, scope, and discussion of the license itself, please see the most recent thread from Chris on the subject: http://lists.llvm.org/pipermail/llvm-dev/2017-April/112142.html

Also, we have the proposed new developer policy discussed here: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116266.html

Based on these discussions, there seems to be clear consensus to move forward, and we (the Foundation) have been working on this for the past year. I want to update folks on the progress and the next steps on the more boring logistics side of this: how do we actually switch?

Our plan, roughly outlined when discussing the developer policy last year, is to install the new license and the developer policy that references both the new and old license. At that point, all subsequent contributions will be under both licenses. To ensure contributors are aware, we have a two-fold plan:

1) We’re going to get as many active contributors (both companies and individuals) as possible to explicitly sign an agreement to relicense their contributions.
This will make the change clear and will cover historical contributions as well.
2) For any remaining contributors, we will turn off their commit access until we can confirm they are covered by one of the above agreements.

We plan to have the *vast majority* of contributors handled via #1 ahead of time, so this will not be disruptive. If necessary, we can delay this to ensure that #1 covers enough of the active contributors. We do not want to unnecessarily disrupt contributions, but we also want to move this forward as fast as we can.

For contributors who cannot, for whatever reason, complete the outlined process (#2 above), please send email to license-questions at llvm.org and we'll work, in conjunction with our legal counsel, to find a path forward.

Our current planned timeline is to install the new developer policy and the new license after the LLVM 8.0 release branch in January. We will then be focused on getting all of the historical contributions under an agreement to relicense so we can remove the old license(s).

## Relicensing Agreements

For #1 to work, we need both individuals and companies to sign an agreement to relicense. The Foundation has worked with our lawyer and built a process for both companies and individuals.

For individuals, we’re asking everyone to fill out a form so we have the necessary information (email addresses, potential employers, etc.) to effectively relicense their contributions. It contains a link to a DocuSign agreement to relicense any of their individual contributions under the new license. We’re really hoping that most people will just sign this agreement, as it saves us from having to prove that every contribution is definitively covered by some company.
You can fill out the form and sign the agreement here: https://goo.gl/forms/X4HiyYRcRHOnTSvC3

For companies, we also have a DocuSign agreement: https://na3.docusign.net/Member/PowerFormSigning.aspx?PowerFormId=5a2bb38c-41c4-4ce0-a26e-52a7eb8ae51c

We have already reached out to many major companies, and a few have already signed this agreement. We will be collecting more companies from the form responses and reaching out to them. Feel free to reach out to your employer with the DocuSign link above, but please check the list of companies we’ve already contacted and try to coordinate internally to avoid duplicate work.

Once we get the new policy and license in place, we’ll be iterating with these tools until we have everything relicensed, or we have a concrete plan about what to do with any remaining material.

## New File Headers

With the new license and developer policy, we also need to update the file headers. The Foundation worked with our lawyer to get a new header approved that is both minimal and functional:

```
//===-- file/name - File description ----------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
```

Some notable aspects:
- No explicit copyright notice. After discussion with our lawyer, the value doesn’t seem worthwhile, and omitting it avoids the yearly need to update these.
- Super compact, but includes things like an SPDX marker to ease automated license analysis.

We will install these new file headers at the same time as the new developer policy and license.

Thanks all, and don’t hesitate to reach out with any questions!
-Chandler (on behalf of the LLVM Foundation)
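As a rough illustration of the automated license analysis that the SPDX marker in the new header is meant to ease, a checker could scan a file's leading comment block for the `SPDX-License-Identifier:` tag and read the license expression that follows it. The sketch below is only an assumption about how such a tool might look: the helper name `spdx_expression`, the regular expression, and the `NEW_HEADER` constant are hypothetical and are not part of any LLVM tooling.

```python
import re

# Hypothetical helper, not part of any LLVM tooling: matches the SPDX tag and
# captures the license expression that runs to the end of that line.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*$", re.MULTILINE)


def spdx_expression(header_text):
    """Return the SPDX expression found in header_text, or None if absent."""
    match = SPDX_TAG.search(header_text)
    return match.group(1) if match else None


# The three license lines of the header quoted above.
NEW_HEADER = """\
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
"""

if __name__ == "__main__":
    print(spdx_expression(NEW_HEADER))
```

A real scanner would presumably read only the first few lines of each file and compare the result against the expected expression. Files for which the helper returns `None` would be the ones still carrying an old header.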
From libclc-dev at lists.llvm.org Sun Oct 28 23:40:26 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:40:26 -0400 Subject: [Libclc-dev] [PATCH 1/4] r600: Convert get_local_size to clc Message-ID: <20181029064029.12312-1-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- This series consolidates the existing llvm asm variants into clc files. I've verified that the code generated using llvm-5 is the same for the get-local-size.cl, get-global-size.cl, get-num-groups.cl, and global-memory.cl piglits (and all of them pass on Turks). r600/lib/OVERRIDES_3.9 | 1 - r600/lib/OVERRIDES_4.0 | 1 - r600/lib/OVERRIDES_5.0 | 1 - r600/lib/OVERRIDES_6.0 | 1 - r600/lib/SOURCES | 2 +- r600/lib/SOURCES_3.9 | 1 - r600/lib/SOURCES_4.0 | 1 - r600/lib/SOURCES_5.0 | 1 - r600/lib/SOURCES_6.0 | 1 - r600/lib/workitem/get_local_size.39.ll | 20 -------------------- r600/lib/workitem/get_local_size.cl | 15 +++++++++++++++ r600/lib/workitem/get_local_size.ll | 20 -------------------- 12 files changed, 16 insertions(+), 49 deletions(-) delete mode 100644 r600/lib/workitem/get_local_size.39.ll create mode 100644 r600/lib/workitem/get_local_size.cl delete mode 100644 r600/lib/workitem/get_local_size.ll diff --git a/r600/lib/OVERRIDES_3.9 b/r600/lib/OVERRIDES_3.9 index c055c6d..e1a6ae8 100644 --- a/r600/lib/OVERRIDES_3.9 +++ b/r600/lib/OVERRIDES_3.9 @@ -1,4 +1,3 @@ synchronization/barrier_impl.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_4.0 b/r600/lib/OVERRIDES_4.0 index c055c6d..e1a6ae8 100644 --- a/r600/lib/OVERRIDES_4.0 +++ b/r600/lib/OVERRIDES_4.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_5.0 b/r600/lib/OVERRIDES_5.0 index c055c6d..e1a6ae8 100644 --- a/r600/lib/OVERRIDES_5.0 +++ b/r600/lib/OVERRIDES_5.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.ll 
workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_6.0 b/r600/lib/OVERRIDES_6.0 index c055c6d..e1a6ae8 100644 --- a/r600/lib/OVERRIDES_6.0 +++ b/r600/lib/OVERRIDES_6.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES index e69be4a..75cf901 100644 --- a/r600/lib/SOURCES +++ b/r600/lib/SOURCES @@ -5,6 +5,6 @@ workitem/get_global_offset.cl workitem/get_group_id.cl workitem/get_global_size.ll workitem/get_local_id.cl -workitem/get_local_size.ll +workitem/get_local_size.cl workitem/get_num_groups.ll workitem/get_work_dim.cl diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9 index ba09398..9f36052 100644 --- a/r600/lib/SOURCES_3.9 +++ b/r600/lib/SOURCES_3.9 @@ -15,5 +15,4 @@ image/write_imageui.cl image/write_image_impl.ll synchronization/barrier_impl.39.ll workitem/get_global_size.39.ll -workitem/get_local_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_4.0 b/r600/lib/SOURCES_4.0 index 091990c..6ca2332 100644 --- a/r600/lib/SOURCES_4.0 +++ b/r600/lib/SOURCES_4.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.39.ll workitem/get_global_size.39.ll -workitem/get_local_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_5.0 b/r600/lib/SOURCES_5.0 index 091990c..6ca2332 100644 --- a/r600/lib/SOURCES_5.0 +++ b/r600/lib/SOURCES_5.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.39.ll workitem/get_global_size.39.ll -workitem/get_local_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_6.0 b/r600/lib/SOURCES_6.0 index 091990c..6ca2332 100644 --- a/r600/lib/SOURCES_6.0 +++ b/r600/lib/SOURCES_6.0 @@ -1,4 +1,3 @@ synchronization/barrier_impl.39.ll workitem/get_global_size.39.ll -workitem/get_local_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/workitem/get_local_size.39.ll b/r600/lib/workitem/get_local_size.39.ll 
deleted file mode 100644 index c9f2c84..0000000 --- a/r600/lib/workitem/get_local_size.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.local.size.x() nounwind readnone -declare i32 @llvm.r600.read.local.size.y() nounwind readnone -declare i32 @llvm.r600.read.local.size.z() nounwind readnone - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_local_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.local.size.x() - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.local.size.y() - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.local.size.z() - ret i32 %z -default: - ret i32 1 -} diff --git a/r600/lib/workitem/get_local_size.cl b/r600/lib/workitem/get_local_size.cl new file mode 100644 index 0000000..89e2612 --- /dev/null +++ b/r600/lib/workitem/get_local_size.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_r600_get_local_size_x(void) __asm("llvm.r600.read.local.size.x"); +uint __clc_r600_get_local_size_y(void) __asm("llvm.r600.read.local.size.y"); +uint __clc_r600_get_local_size_z(void) __asm("llvm.r600.read.local.size.z"); + +_CLC_DEF size_t get_local_size(uint dim) +{ + switch (dim) { + case 0: return __clc_r600_get_local_size_x(); + case 1: return __clc_r600_get_local_size_y(); + case 2: return __clc_r600_get_local_size_z(); + default: return 1; + } +} diff --git a/r600/lib/workitem/get_local_size.ll b/r600/lib/workitem/get_local_size.ll deleted file mode 100644 index 04ce076..0000000 --- a/r600/lib/workitem/get_local_size.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.local.size.x() nounwind readnone -declare i32 @llvm.r600.read.local.size.y() nounwind readnone -declare i32 @llvm.r600.read.local.size.z() nounwind readnone - -target datalayout = 
"e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define i32 @get_local_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.local.size.x() - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.local.size.y() - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.local.size.z() - ret i32 %z -default: - ret i32 1 -} -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:40:27 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:40:27 -0400 Subject: [Libclc-dev] [PATCH 2/4] r600: Convert get_global_size to clc In-Reply-To: <20181029064029.12312-1-jan.vesely@rutgers.edu> References: <20181029064029.12312-1-jan.vesely@rutgers.edu> Message-ID: <20181029064029.12312-2-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- r600/lib/OVERRIDES_3.9 | 1 - r600/lib/OVERRIDES_4.0 | 1 - r600/lib/OVERRIDES_5.0 | 1 - r600/lib/OVERRIDES_6.0 | 1 - r600/lib/SOURCES | 2 +- r600/lib/SOURCES_3.9 | 1 - r600/lib/SOURCES_4.0 | 1 - r600/lib/SOURCES_5.0 | 1 - r600/lib/SOURCES_6.0 | 1 - r600/lib/workitem/get_global_size.39.ll | 20 -------------------- r600/lib/workitem/get_global_size.cl | 15 +++++++++++++++ r600/lib/workitem/get_global_size.ll | 20 -------------------- 12 files changed, 16 insertions(+), 49 deletions(-) delete mode 100644 r600/lib/workitem/get_global_size.39.ll create mode 100644 r600/lib/workitem/get_global_size.cl delete mode 100644 r600/lib/workitem/get_global_size.ll diff --git a/r600/lib/OVERRIDES_3.9 b/r600/lib/OVERRIDES_3.9 index e1a6ae8..40638cc 100644 --- a/r600/lib/OVERRIDES_3.9 +++ b/r600/lib/OVERRIDES_3.9 @@ -1,3 +1,2 @@ synchronization/barrier_impl.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_4.0 b/r600/lib/OVERRIDES_4.0 index e1a6ae8..40638cc 100644 --- a/r600/lib/OVERRIDES_4.0 
+++ b/r600/lib/OVERRIDES_4.0 @@ -1,3 +1,2 @@ synchronization/barrier_impl.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_5.0 b/r600/lib/OVERRIDES_5.0 index e1a6ae8..40638cc 100644 --- a/r600/lib/OVERRIDES_5.0 +++ b/r600/lib/OVERRIDES_5.0 @@ -1,3 +1,2 @@ synchronization/barrier_impl.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_6.0 b/r600/lib/OVERRIDES_6.0 index e1a6ae8..40638cc 100644 --- a/r600/lib/OVERRIDES_6.0 +++ b/r600/lib/OVERRIDES_6.0 @@ -1,3 +1,2 @@ synchronization/barrier_impl.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES index 75cf901..1d8d31d 100644 --- a/r600/lib/SOURCES +++ b/r600/lib/SOURCES @@ -3,7 +3,7 @@ math/fmin.cl synchronization/barrier_impl.ll workitem/get_global_offset.cl workitem/get_group_id.cl -workitem/get_global_size.ll +workitem/get_global_size.cl workitem/get_local_id.cl workitem/get_local_size.cl workitem/get_num_groups.ll diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9 index 9f36052..9348387 100644 --- a/r600/lib/SOURCES_3.9 +++ b/r600/lib/SOURCES_3.9 @@ -14,5 +14,4 @@ image/write_imagei.cl image/write_imageui.cl image/write_image_impl.ll synchronization/barrier_impl.39.ll -workitem/get_global_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_4.0 b/r600/lib/SOURCES_4.0 index 6ca2332..93d3330 100644 --- a/r600/lib/SOURCES_4.0 +++ b/r600/lib/SOURCES_4.0 @@ -1,3 +1,2 @@ synchronization/barrier_impl.39.ll -workitem/get_global_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_5.0 b/r600/lib/SOURCES_5.0 index 6ca2332..93d3330 100644 --- a/r600/lib/SOURCES_5.0 +++ b/r600/lib/SOURCES_5.0 @@ -1,3 +1,2 @@ synchronization/barrier_impl.39.ll -workitem/get_global_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_6.0 b/r600/lib/SOURCES_6.0 index 6ca2332..93d3330 100644 --- a/r600/lib/SOURCES_6.0 +++ b/r600/lib/SOURCES_6.0 @@ 
-1,3 +1,2 @@ synchronization/barrier_impl.39.ll -workitem/get_global_size.39.ll workitem/get_num_groups.39.ll diff --git a/r600/lib/workitem/get_global_size.39.ll b/r600/lib/workitem/get_global_size.39.ll deleted file mode 100644 index ea58c36..0000000 --- a/r600/lib/workitem/get_global_size.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.global.size.x() nounwind readnone -declare i32 @llvm.r600.read.global.size.y() nounwind readnone -declare i32 @llvm.r600.read.global.size.z() nounwind readnone - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_global_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.global.size.x() nounwind readnone - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.global.size.y() nounwind readnone - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.global.size.z() nounwind readnone - ret i32 %z -default: - ret i32 1 -} diff --git a/r600/lib/workitem/get_global_size.cl b/r600/lib/workitem/get_global_size.cl new file mode 100644 index 0000000..d356929 --- /dev/null +++ b/r600/lib/workitem/get_global_size.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_r600_get_global_size_x(void) __asm("llvm.r600.read.global.size.x"); +uint __clc_r600_get_global_size_y(void) __asm("llvm.r600.read.global.size.y"); +uint __clc_r600_get_global_size_z(void) __asm("llvm.r600.read.global.size.z"); + +_CLC_DEF size_t get_global_size(uint dim) +{ + switch (dim) { + case 0: return __clc_r600_get_global_size_x(); + case 1: return __clc_r600_get_global_size_y(); + case 2: return __clc_r600_get_global_size_z(); + default: return 1; + } +} diff --git a/r600/lib/workitem/get_global_size.ll b/r600/lib/workitem/get_global_size.ll deleted file mode 100644 index d6d10b3..0000000 --- a/r600/lib/workitem/get_global_size.ll +++ 
/dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.global.size.x() nounwind readnone -declare i32 @llvm.r600.read.global.size.y() nounwind readnone -declare i32 @llvm.r600.read.global.size.z() nounwind readnone - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define i32 @get_global_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.global.size.x() nounwind readnone - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.global.size.y() nounwind readnone - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.global.size.z() nounwind readnone - ret i32 %z -default: - ret i32 1 -} -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:40:28 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:40:28 -0400 Subject: [Libclc-dev] [PATCH 3/4] r600: Convert get_num_groups to clc In-Reply-To: <20181029064029.12312-1-jan.vesely@rutgers.edu> References: <20181029064029.12312-1-jan.vesely@rutgers.edu> Message-ID: <20181029064029.12312-3-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- r600/lib/OVERRIDES_3.9 | 1 - r600/lib/OVERRIDES_4.0 | 1 - r600/lib/OVERRIDES_5.0 | 1 - r600/lib/OVERRIDES_6.0 | 1 - r600/lib/SOURCES | 2 +- r600/lib/SOURCES_3.9 | 1 - r600/lib/SOURCES_4.0 | 1 - r600/lib/SOURCES_5.0 | 1 - r600/lib/SOURCES_6.0 | 1 - r600/lib/workitem/get_num_groups.39.ll | 20 -------------------- r600/lib/workitem/get_num_groups.cl | 15 +++++++++++++++ r600/lib/workitem/get_num_groups.ll | 20 -------------------- 12 files changed, 16 insertions(+), 49 deletions(-) delete mode 100644 r600/lib/workitem/get_num_groups.39.ll create mode 100644 r600/lib/workitem/get_num_groups.cl delete mode 100644 r600/lib/workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_3.9 b/r600/lib/OVERRIDES_3.9 index 40638cc..c99f3fc 
100644 --- a/r600/lib/OVERRIDES_3.9 +++ b/r600/lib/OVERRIDES_3.9 @@ -1,2 +1 @@ synchronization/barrier_impl.ll -workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_4.0 b/r600/lib/OVERRIDES_4.0 index 40638cc..c99f3fc 100644 --- a/r600/lib/OVERRIDES_4.0 +++ b/r600/lib/OVERRIDES_4.0 @@ -1,2 +1 @@ synchronization/barrier_impl.ll -workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_5.0 b/r600/lib/OVERRIDES_5.0 index 40638cc..c99f3fc 100644 --- a/r600/lib/OVERRIDES_5.0 +++ b/r600/lib/OVERRIDES_5.0 @@ -1,2 +1 @@ synchronization/barrier_impl.ll -workitem/get_num_groups.ll diff --git a/r600/lib/OVERRIDES_6.0 b/r600/lib/OVERRIDES_6.0 index 40638cc..c99f3fc 100644 --- a/r600/lib/OVERRIDES_6.0 +++ b/r600/lib/OVERRIDES_6.0 @@ -1,2 +1 @@ synchronization/barrier_impl.ll -workitem/get_num_groups.ll diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES index 1d8d31d..b3180ed 100644 --- a/r600/lib/SOURCES +++ b/r600/lib/SOURCES @@ -6,5 +6,5 @@ workitem/get_group_id.cl workitem/get_global_size.cl workitem/get_local_id.cl workitem/get_local_size.cl -workitem/get_num_groups.ll +workitem/get_num_groups.cl workitem/get_work_dim.cl diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9 index 9348387..560a86d 100644 --- a/r600/lib/SOURCES_3.9 +++ b/r600/lib/SOURCES_3.9 @@ -14,4 +14,3 @@ image/write_imagei.cl image/write_imageui.cl image/write_image_impl.ll synchronization/barrier_impl.39.ll -workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_4.0 b/r600/lib/SOURCES_4.0 index 93d3330..3c56d80 100644 --- a/r600/lib/SOURCES_4.0 +++ b/r600/lib/SOURCES_4.0 @@ -1,2 +1 @@ synchronization/barrier_impl.39.ll -workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_5.0 b/r600/lib/SOURCES_5.0 index 93d3330..3c56d80 100644 --- a/r600/lib/SOURCES_5.0 +++ b/r600/lib/SOURCES_5.0 @@ -1,2 +1 @@ synchronization/barrier_impl.39.ll -workitem/get_num_groups.39.ll diff --git a/r600/lib/SOURCES_6.0 b/r600/lib/SOURCES_6.0 index 93d3330..3c56d80 100644 --- a/r600/lib/SOURCES_6.0 +++ 
b/r600/lib/SOURCES_6.0 @@ -1,2 +1 @@ synchronization/barrier_impl.39.ll -workitem/get_num_groups.39.ll diff --git a/r600/lib/workitem/get_num_groups.39.ll b/r600/lib/workitem/get_num_groups.39.ll deleted file mode 100644 index 74ca78b..0000000 --- a/r600/lib/workitem/get_num_groups.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.ngroups.x() nounwind readnone -declare i32 @llvm.r600.read.ngroups.y() nounwind readnone -declare i32 @llvm.r600.read.ngroups.z() nounwind readnone - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_num_groups(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.ngroups.x() nounwind readnone - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.ngroups.y() nounwind readnone - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.ngroups.z() nounwind readnone - ret i32 %z -default: - ret i32 1 -} diff --git a/r600/lib/workitem/get_num_groups.cl b/r600/lib/workitem/get_num_groups.cl new file mode 100644 index 0000000..dfe6cef --- /dev/null +++ b/r600/lib/workitem/get_num_groups.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_r600_get_num_groups_x(void) __asm("llvm.r600.read.ngroups.x"); +uint __clc_r600_get_num_groups_y(void) __asm("llvm.r600.read.ngroups.y"); +uint __clc_r600_get_num_groups_z(void) __asm("llvm.r600.read.ngroups.z"); + +_CLC_DEF size_t get_num_groups(uint dim) +{ + switch (dim) { + case 0: return __clc_r600_get_num_groups_x(); + case 1: return __clc_r600_get_num_groups_y(); + case 2: return __clc_r600_get_num_groups_z(); + default: return 1; + } +} diff --git a/r600/lib/workitem/get_num_groups.ll b/r600/lib/workitem/get_num_groups.ll deleted file mode 100644 index b31f4cf..0000000 --- a/r600/lib/workitem/get_num_groups.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.ngroups.x() 
nounwind readnone -declare i32 @llvm.r600.read.ngroups.y() nounwind readnone -declare i32 @llvm.r600.read.ngroups.z() nounwind readnone - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define i32 @get_num_groups(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.ngroups.x() nounwind readnone - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.ngroups.y() nounwind readnone - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.ngroups.z() nounwind readnone - ret i32 %z -default: - ret i32 1 -} -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:40:29 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:40:29 -0400 Subject: [Libclc-dev] [PATCH 4/4] r600: Convert barrier to clc In-Reply-To: <20181029064029.12312-1-jan.vesely@rutgers.edu> References: <20181029064029.12312-1-jan.vesely@rutgers.edu> Message-ID: <20181029064029.12312-4-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- r600/lib/OVERRIDES_3.9 | 1 - r600/lib/OVERRIDES_4.0 | 1 - r600/lib/OVERRIDES_5.0 | 1 - r600/lib/OVERRIDES_6.0 | 1 - r600/lib/SOURCES | 2 +- r600/lib/SOURCES_3.9 | 1 - r600/lib/SOURCES_4.0 | 1 - r600/lib/SOURCES_5.0 | 1 - r600/lib/SOURCES_6.0 | 1 - r600/lib/synchronization/barrier.cl | 9 +++++++++ r600/lib/synchronization/barrier_impl.39.ll | 13 ------------- r600/lib/synchronization/barrier_impl.ll | 13 ------------- 12 files changed, 10 insertions(+), 35 deletions(-) delete mode 100644 r600/lib/OVERRIDES_3.9 delete mode 100644 r600/lib/OVERRIDES_4.0 delete mode 100644 r600/lib/OVERRIDES_5.0 delete mode 100644 r600/lib/OVERRIDES_6.0 delete mode 100644 r600/lib/SOURCES_4.0 delete mode 100644 r600/lib/SOURCES_5.0 delete mode 100644 r600/lib/SOURCES_6.0 create mode 100644 r600/lib/synchronization/barrier.cl delete mode 100644 
r600/lib/synchronization/barrier_impl.39.ll delete mode 100644 r600/lib/synchronization/barrier_impl.ll diff --git a/r600/lib/OVERRIDES_3.9 b/r600/lib/OVERRIDES_3.9 deleted file mode 100644 index c99f3fc..0000000 --- a/r600/lib/OVERRIDES_3.9 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.ll diff --git a/r600/lib/OVERRIDES_4.0 b/r600/lib/OVERRIDES_4.0 deleted file mode 100644 index c99f3fc..0000000 --- a/r600/lib/OVERRIDES_4.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.ll diff --git a/r600/lib/OVERRIDES_5.0 b/r600/lib/OVERRIDES_5.0 deleted file mode 100644 index c99f3fc..0000000 --- a/r600/lib/OVERRIDES_5.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.ll diff --git a/r600/lib/OVERRIDES_6.0 b/r600/lib/OVERRIDES_6.0 deleted file mode 100644 index c99f3fc..0000000 --- a/r600/lib/OVERRIDES_6.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.ll diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES index b3180ed..4342ac3 100644 --- a/r600/lib/SOURCES +++ b/r600/lib/SOURCES @@ -1,6 +1,6 @@ math/fmax.cl math/fmin.cl -synchronization/barrier_impl.ll +synchronization/barrier.cl workitem/get_global_offset.cl workitem/get_group_id.cl workitem/get_global_size.cl diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9 index 560a86d..a44a9ce 100644 --- a/r600/lib/SOURCES_3.9 +++ b/r600/lib/SOURCES_3.9 @@ -13,4 +13,3 @@ image/write_imagef.cl image/write_imagei.cl image/write_imageui.cl image/write_image_impl.ll -synchronization/barrier_impl.39.ll diff --git a/r600/lib/SOURCES_4.0 b/r600/lib/SOURCES_4.0 deleted file mode 100644 index 3c56d80..0000000 --- a/r600/lib/SOURCES_4.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.39.ll diff --git a/r600/lib/SOURCES_5.0 b/r600/lib/SOURCES_5.0 deleted file mode 100644 index 3c56d80..0000000 --- a/r600/lib/SOURCES_5.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.39.ll diff --git a/r600/lib/SOURCES_6.0 b/r600/lib/SOURCES_6.0 deleted file mode 100644 index 3c56d80..0000000 --- 
a/r600/lib/SOURCES_6.0 +++ /dev/null @@ -1 +0,0 @@ -synchronization/barrier_impl.39.ll diff --git a/r600/lib/synchronization/barrier.cl b/r600/lib/synchronization/barrier.cl new file mode 100644 index 0000000..98200e7 --- /dev/null +++ b/r600/lib/synchronization/barrier.cl @@ -0,0 +1,9 @@ +#include + +_CLC_DEF void __clc_r600_barrier(void) __asm("llvm.r600.group.barrier"); + +_CLC_DEF void barrier(uint flags) +{ + // We should call mem_fence here, but that is not implemented for r600 yet + __clc_r600_barrier(); +} diff --git a/r600/lib/synchronization/barrier_impl.39.ll b/r600/lib/synchronization/barrier_impl.39.ll deleted file mode 100644 index 3bd3167..0000000 --- a/r600/lib/synchronization/barrier_impl.39.ll +++ /dev/null @@ -1,13 +0,0 @@ -declare void @llvm.r600.group.barrier() #0 - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define void @barrier(i32 %flags) #1 { -entry: - ; We should call mem_fence here, but that is not implemented for r600 yet - tail call void @llvm.r600.group.barrier() - ret void -} - -attributes #0 = { nounwind convergent } -attributes #1 = { nounwind convergent alwaysinline } diff --git a/r600/lib/synchronization/barrier_impl.ll b/r600/lib/synchronization/barrier_impl.ll deleted file mode 100644 index f1cbc9a..0000000 --- a/r600/lib/synchronization/barrier_impl.ll +++ /dev/null @@ -1,13 +0,0 @@ -declare void @llvm.r600.group.barrier() #0 - -target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define void @barrier(i32 %flags) #1 { -entry: - ; We should call mem_fence here, but that is not implemented for r600 yet - tail call void @llvm.r600.group.barrier() - ret void -} - -attributes #0 = { nounwind convergent } -attributes #1 = { nounwind convergent alwaysinline } -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:43:01 2018 From: libclc-dev at 
lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:43:01 -0400 Subject: [Libclc-dev] [PATCH 1/5] amdgcn: Convert get_local_size to clc Message-ID: <20181029064305.12484-1-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- Similar to the previous series, this one consolidates various llvm asm implementations into clc files. The code generated by llvm5 for the global-memory.cl and local-memory.cl piglits is identical. The code for get-global-size.cl, get-local-size.cl, and get-num-groups.cl is slightly different as the current code added an extra 'and' operation when retrieving the Z dimension local size: %z_size = load i32, i32 addrspace(2)* %z_size_ptr, align 4, !invariant.load !0, !range !1 %z_size.ext = zext i32 %z_size to i64 was changed to: return ptr[2] & 0xffffu; Although I don't think the high bytes will ever be anything other than zeros, I think the latter is a bit more correct. amdgcn/lib/OVERRIDES_3.9 | 1 - amdgcn/lib/OVERRIDES_4.0 | 1 - amdgcn/lib/OVERRIDES_5.0 | 1 - amdgcn/lib/OVERRIDES_6.0 | 1 - amdgcn/lib/SOURCES | 2 +- amdgcn/lib/SOURCES_3.9 | 1 - amdgcn/lib/SOURCES_4.0 | 1 - amdgcn/lib/SOURCES_5.0 | 1 - amdgcn/lib/SOURCES_6.0 | 1 - amdgcn/lib/workitem/get_local_size.39.ll | 20 -------------------- amdgcn/lib/workitem/get_local_size.40.ll | 23 ----------------------- amdgcn/lib/workitem/get_local_size.cl | 15 +++++++++++++++ amdgcn/lib/workitem/get_local_size.ll | 23 ----------------------- 13 files changed, 16 insertions(+), 75 deletions(-) delete mode 100644 amdgcn/lib/workitem/get_local_size.39.ll delete mode 100644 amdgcn/lib/workitem/get_local_size.40.ll create mode 100644 amdgcn/lib/workitem/get_local_size.cl delete mode 100644 amdgcn/lib/workitem/get_local_size.ll diff --git a/amdgcn/lib/OVERRIDES_3.9 b/amdgcn/lib/OVERRIDES_3.9 index 3268f67..ed6c06d 100644 --- a/amdgcn/lib/OVERRIDES_3.9 +++ b/amdgcn/lib/OVERRIDES_3.9 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll workitem/get_global_size.ll 
-workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_4.0 b/amdgcn/lib/OVERRIDES_4.0 index 3268f67..ed6c06d 100644 --- a/amdgcn/lib/OVERRIDES_4.0 +++ b/amdgcn/lib/OVERRIDES_4.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_5.0 b/amdgcn/lib/OVERRIDES_5.0 index 3268f67..ed6c06d 100644 --- a/amdgcn/lib/OVERRIDES_5.0 +++ b/amdgcn/lib/OVERRIDES_5.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_6.0 b/amdgcn/lib/OVERRIDES_6.0 index 3268f67..ed6c06d 100644 --- a/amdgcn/lib/OVERRIDES_6.0 +++ b/amdgcn/lib/OVERRIDES_6.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll workitem/get_global_size.ll -workitem/get_local_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/SOURCES b/amdgcn/lib/SOURCES index f78a272..8e9fdde 100644 --- a/amdgcn/lib/SOURCES +++ b/amdgcn/lib/SOURCES @@ -12,6 +12,6 @@ workitem/get_global_offset.cl workitem/get_group_id.cl workitem/get_global_size.ll workitem/get_local_id.cl -workitem/get_local_size.ll +workitem/get_local_size.cl workitem/get_num_groups.ll workitem/get_work_dim.cl diff --git a/amdgcn/lib/SOURCES_3.9 b/amdgcn/lib/SOURCES_3.9 index 3cecdb0..8b2a255 100644 --- a/amdgcn/lib/SOURCES_3.9 +++ b/amdgcn/lib/SOURCES_3.9 @@ -1,5 +1,4 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll workitem/get_global_size.39.ll -workitem/get_local_size.39.ll workitem/get_num_groups.39.ll diff --git a/amdgcn/lib/SOURCES_4.0 b/amdgcn/lib/SOURCES_4.0 index 5ed1d7c..5342d54 100644 --- a/amdgcn/lib/SOURCES_4.0 +++ b/amdgcn/lib/SOURCES_4.0 @@ -1,5 +1,4 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll workitem/get_global_size.40.ll -workitem/get_local_size.40.ll workitem/get_num_groups.40.ll diff 
--git a/amdgcn/lib/SOURCES_5.0 b/amdgcn/lib/SOURCES_5.0 index 45c51ec..0977b32 100644 --- a/amdgcn/lib/SOURCES_5.0 +++ b/amdgcn/lib/SOURCES_5.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll workitem/get_global_size.40.ll -workitem/get_local_size.40.ll workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/SOURCES_6.0 b/amdgcn/lib/SOURCES_6.0 index 45c51ec..0977b32 100644 --- a/amdgcn/lib/SOURCES_6.0 +++ b/amdgcn/lib/SOURCES_6.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll workitem/get_global_size.40.ll -workitem/get_local_size.40.ll workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/workitem/get_local_size.39.ll b/amdgcn/lib/workitem/get_local_size.39.ll deleted file mode 100644 index 4fe483a..0000000 --- a/amdgcn/lib/workitem/get_local_size.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.local.size.x() nounwind readnone -declare i32 @llvm.r600.read.local.size.y() nounwind readnone -declare i32 @llvm.r600.read.local.size.z() nounwind readnone - -target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_local_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.local.size.x() - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.local.size.y() - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.local.size.z() - ret i32 %z -default: - ret i32 1 -} diff --git a/amdgcn/lib/workitem/get_local_size.40.ll b/amdgcn/lib/workitem/get_local_size.40.ll deleted file mode 100644 index 36141f9..0000000 --- a/amdgcn/lib/workitem/get_local_size.40.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.local.size.x() nounwind readnone -declare i32 @llvm.r600.read.local.size.y() nounwind readnone -declare i32 @llvm.r600.read.local.size.z() nounwind 
readnone - -target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i64 @get_local_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.local.size.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 @llvm.r600.read.local.size.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.local.size.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} diff --git a/amdgcn/lib/workitem/get_local_size.cl b/amdgcn/lib/workitem/get_local_size.cl new file mode 100644 index 0000000..9b19f6b --- /dev/null +++ b/amdgcn/lib/workitem/get_local_size.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_amdgcn_get_local_size_x(void) __asm("llvm.r600.read.local.size.x"); +uint __clc_amdgcn_get_local_size_y(void) __asm("llvm.r600.read.local.size.y"); +uint __clc_amdgcn_get_local_size_z(void) __asm("llvm.r600.read.local.size.z"); + +_CLC_DEF size_t get_local_size(uint dim) +{ + switch (dim) { + case 0: return __clc_amdgcn_get_local_size_x(); + case 1: return __clc_amdgcn_get_local_size_y(); + case 2: return __clc_amdgcn_get_local_size_z(); + default: return 1; + } +} diff --git a/amdgcn/lib/workitem/get_local_size.ll b/amdgcn/lib/workitem/get_local_size.ll deleted file mode 100644 index 988c4cb..0000000 --- a/amdgcn/lib/workitem/get_local_size.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.local.size.x() nounwind readnone -declare i32 @llvm.r600.read.local.size.y() nounwind readnone -declare i32 @llvm.r600.read.local.size.z() nounwind readnone - -target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define 
i64 @get_local_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.local.size.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 @llvm.r600.read.local.size.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.local.size.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:43:02 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:43:02 -0400 Subject: [Libclc-dev] [PATCH 2/5] amdgcn: Convert get_global_size to clc In-Reply-To: <20181029064305.12484-1-jan.vesely@rutgers.edu> References: <20181029064305.12484-1-jan.vesely@rutgers.edu> Message-ID: <20181029064305.12484-2-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- amdgcn/lib/OVERRIDES_3.9 | 1 - amdgcn/lib/OVERRIDES_4.0 | 1 - amdgcn/lib/OVERRIDES_5.0 | 1 - amdgcn/lib/OVERRIDES_6.0 | 1 - amdgcn/lib/SOURCES | 2 +- amdgcn/lib/SOURCES_3.9 | 1 - amdgcn/lib/SOURCES_4.0 | 1 - amdgcn/lib/SOURCES_5.0 | 1 - amdgcn/lib/SOURCES_6.0 | 1 - amdgcn/lib/workitem/get_global_size.39.ll | 20 -------------------- amdgcn/lib/workitem/get_global_size.40.ll | 23 ----------------------- amdgcn/lib/workitem/get_global_size.cl | 15 +++++++++++++++ amdgcn/lib/workitem/get_global_size.ll | 23 ----------------------- 13 files changed, 16 insertions(+), 75 deletions(-) delete mode 100644 amdgcn/lib/workitem/get_global_size.39.ll delete mode 100644 amdgcn/lib/workitem/get_global_size.40.ll create mode 100644 amdgcn/lib/workitem/get_global_size.cl delete mode 100644 amdgcn/lib/workitem/get_global_size.ll diff --git a/amdgcn/lib/OVERRIDES_3.9 b/amdgcn/lib/OVERRIDES_3.9 index ed6c06d..4811cf0 100644 --- a/amdgcn/lib/OVERRIDES_3.9 +++ b/amdgcn/lib/OVERRIDES_3.9 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll 
-workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_4.0 b/amdgcn/lib/OVERRIDES_4.0 index ed6c06d..4811cf0 100644 --- a/amdgcn/lib/OVERRIDES_4.0 +++ b/amdgcn/lib/OVERRIDES_4.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_5.0 b/amdgcn/lib/OVERRIDES_5.0 index ed6c06d..4811cf0 100644 --- a/amdgcn/lib/OVERRIDES_5.0 +++ b/amdgcn/lib/OVERRIDES_5.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_6.0 b/amdgcn/lib/OVERRIDES_6.0 index ed6c06d..4811cf0 100644 --- a/amdgcn/lib/OVERRIDES_6.0 +++ b/amdgcn/lib/OVERRIDES_6.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_global_size.ll workitem/get_num_groups.ll diff --git a/amdgcn/lib/SOURCES b/amdgcn/lib/SOURCES index 8e9fdde..666ae10 100644 --- a/amdgcn/lib/SOURCES +++ b/amdgcn/lib/SOURCES @@ -10,7 +10,7 @@ mem_fence/fence.cl synchronization/barrier.cl workitem/get_global_offset.cl workitem/get_group_id.cl -workitem/get_global_size.ll +workitem/get_global_size.cl workitem/get_local_id.cl workitem/get_local_size.cl workitem/get_num_groups.ll diff --git a/amdgcn/lib/SOURCES_3.9 b/amdgcn/lib/SOURCES_3.9 index 8b2a255..f4b800f 100644 --- a/amdgcn/lib/SOURCES_3.9 +++ b/amdgcn/lib/SOURCES_3.9 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll -workitem/get_global_size.39.ll workitem/get_num_groups.39.ll diff --git a/amdgcn/lib/SOURCES_4.0 b/amdgcn/lib/SOURCES_4.0 index 5342d54..34f98fe 100644 --- a/amdgcn/lib/SOURCES_4.0 +++ b/amdgcn/lib/SOURCES_4.0 @@ -1,4 +1,3 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll -workitem/get_global_size.40.ll workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/SOURCES_5.0 b/amdgcn/lib/SOURCES_5.0 index 0977b32..ed28126 100644 --- a/amdgcn/lib/SOURCES_5.0 +++ 
b/amdgcn/lib/SOURCES_5.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -workitem/get_global_size.40.ll workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/SOURCES_6.0 b/amdgcn/lib/SOURCES_6.0 index 0977b32..ed28126 100644 --- a/amdgcn/lib/SOURCES_6.0 +++ b/amdgcn/lib/SOURCES_6.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -workitem/get_global_size.40.ll workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/workitem/get_global_size.39.ll b/amdgcn/lib/workitem/get_global_size.39.ll deleted file mode 100644 index 967d541..0000000 --- a/amdgcn/lib/workitem/get_global_size.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.global.size.x() nounwind readnone -declare i32 @llvm.r600.read.global.size.y() nounwind readnone -declare i32 @llvm.r600.read.global.size.z() nounwind readnone - -target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_global_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.global.size.x() - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.global.size.y() - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.global.size.z() - ret i32 %z -default: - ret i32 1 -} diff --git a/amdgcn/lib/workitem/get_global_size.40.ll b/amdgcn/lib/workitem/get_global_size.40.ll deleted file mode 100644 index 3d26d2f..0000000 --- a/amdgcn/lib/workitem/get_global_size.40.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.global.size.x() nounwind readnone -declare i32 @llvm.r600.read.global.size.y() nounwind readnone -declare i32 @llvm.r600.read.global.size.z() nounwind readnone - -target datalayout = 
"e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i64 @get_global_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.global.size.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 @llvm.r600.read.global.size.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.global.size.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} diff --git a/amdgcn/lib/workitem/get_global_size.cl b/amdgcn/lib/workitem/get_global_size.cl new file mode 100644 index 0000000..c1e3894 --- /dev/null +++ b/amdgcn/lib/workitem/get_global_size.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_amdgcn_get_global_size_x(void) __asm("llvm.r600.read.global.size.x"); +uint __clc_amdgcn_get_global_size_y(void) __asm("llvm.r600.read.global.size.y"); +uint __clc_amdgcn_get_global_size_z(void) __asm("llvm.r600.read.global.size.z"); + +_CLC_DEF size_t get_global_size(uint dim) +{ + switch (dim) { + case 0: return __clc_amdgcn_get_global_size_x(); + case 1: return __clc_amdgcn_get_global_size_y(); + case 2: return __clc_amdgcn_get_global_size_z(); + default: return 1; + } +} diff --git a/amdgcn/lib/workitem/get_global_size.ll b/amdgcn/lib/workitem/get_global_size.ll deleted file mode 100644 index 43788ed..0000000 --- a/amdgcn/lib/workitem/get_global_size.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.global.size.x() nounwind readnone -declare i32 @llvm.r600.read.global.size.y() nounwind readnone -declare i32 @llvm.r600.read.global.size.z() nounwind readnone - -target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define i64 
@get_global_size(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.global.size.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 @llvm.r600.read.global.size.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.global.size.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:43:03 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:43:03 -0400 Subject: [Libclc-dev] [PATCH 3/5] amdgcn: Convert get_num_groups to clc In-Reply-To: <20181029064305.12484-1-jan.vesely@rutgers.edu> References: <20181029064305.12484-1-jan.vesely@rutgers.edu> Message-ID: <20181029064305.12484-3-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- amdgcn/lib/OVERRIDES_3.9 | 1 - amdgcn/lib/OVERRIDES_4.0 | 1 - amdgcn/lib/OVERRIDES_5.0 | 1 - amdgcn/lib/OVERRIDES_6.0 | 1 - amdgcn/lib/SOURCES | 2 +- amdgcn/lib/SOURCES_3.9 | 1 - amdgcn/lib/SOURCES_4.0 | 1 - amdgcn/lib/SOURCES_5.0 | 1 - amdgcn/lib/SOURCES_6.0 | 1 - amdgcn/lib/workitem/get_num_groups.39.ll | 20 -------------------- amdgcn/lib/workitem/get_num_groups.40.ll | 23 ----------------------- amdgcn/lib/workitem/get_num_groups.cl | 15 +++++++++++++++ amdgcn/lib/workitem/get_num_groups.ll | 23 ----------------------- 13 files changed, 16 insertions(+), 75 deletions(-) delete mode 100644 amdgcn/lib/workitem/get_num_groups.39.ll delete mode 100644 amdgcn/lib/workitem/get_num_groups.40.ll create mode 100644 amdgcn/lib/workitem/get_num_groups.cl delete mode 100644 amdgcn/lib/workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_3.9 b/amdgcn/lib/OVERRIDES_3.9 index 4811cf0..cf58849 100644 --- a/amdgcn/lib/OVERRIDES_3.9 +++ b/amdgcn/lib/OVERRIDES_3.9 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll 
-workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_4.0 b/amdgcn/lib/OVERRIDES_4.0 index 4811cf0..cf58849 100644 --- a/amdgcn/lib/OVERRIDES_4.0 +++ b/amdgcn/lib/OVERRIDES_4.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_5.0 b/amdgcn/lib/OVERRIDES_5.0 index 4811cf0..cf58849 100644 --- a/amdgcn/lib/OVERRIDES_5.0 +++ b/amdgcn/lib/OVERRIDES_5.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_num_groups.ll diff --git a/amdgcn/lib/OVERRIDES_6.0 b/amdgcn/lib/OVERRIDES_6.0 index 4811cf0..cf58849 100644 --- a/amdgcn/lib/OVERRIDES_6.0 +++ b/amdgcn/lib/OVERRIDES_6.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.ll -workitem/get_num_groups.ll diff --git a/amdgcn/lib/SOURCES b/amdgcn/lib/SOURCES index 666ae10..89e694d 100644 --- a/amdgcn/lib/SOURCES +++ b/amdgcn/lib/SOURCES @@ -13,5 +13,5 @@ workitem/get_group_id.cl workitem/get_global_size.cl workitem/get_local_id.cl workitem/get_local_size.cl -workitem/get_num_groups.ll +workitem/get_num_groups.cl workitem/get_work_dim.cl diff --git a/amdgcn/lib/SOURCES_3.9 b/amdgcn/lib/SOURCES_3.9 index f4b800f..86a222e 100644 --- a/amdgcn/lib/SOURCES_3.9 +++ b/amdgcn/lib/SOURCES_3.9 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll -workitem/get_num_groups.39.ll diff --git a/amdgcn/lib/SOURCES_4.0 b/amdgcn/lib/SOURCES_4.0 index 34f98fe..86a222e 100644 --- a/amdgcn/lib/SOURCES_4.0 +++ b/amdgcn/lib/SOURCES_4.0 @@ -1,3 +1,2 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll mem_fence/waitcnt.ll -workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/SOURCES_5.0 b/amdgcn/lib/SOURCES_5.0 index ed28126..c97d406 100644 --- a/amdgcn/lib/SOURCES_5.0 +++ b/amdgcn/lib/SOURCES_5.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/SOURCES_6.0 b/amdgcn/lib/SOURCES_6.0 index ed28126..c97d406 100644 --- 
a/amdgcn/lib/SOURCES_6.0 +++ b/amdgcn/lib/SOURCES_6.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -workitem/get_num_groups.40.ll diff --git a/amdgcn/lib/workitem/get_num_groups.39.ll b/amdgcn/lib/workitem/get_num_groups.39.ll deleted file mode 100644 index fc52fdc..0000000 --- a/amdgcn/lib/workitem/get_num_groups.39.ll +++ /dev/null @@ -1,20 +0,0 @@ -declare i32 @llvm.r600.read.ngroups.x() nounwind readnone -declare i32 @llvm.r600.read.ngroups.y() nounwind readnone -declare i32 @llvm.r600.read.ngroups.z() nounwind readnone - -target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i32 @get_num_groups(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.ngroups.x() - ret i32 %x -y_dim: - %y = call i32 @llvm.r600.read.ngroups.y() - ret i32 %y -z_dim: - %z = call i32 @llvm.r600.read.ngroups.z() - ret i32 %z -default: - ret i32 1 -} diff --git a/amdgcn/lib/workitem/get_num_groups.40.ll b/amdgcn/lib/workitem/get_num_groups.40.ll deleted file mode 100644 index 12ec8ea..0000000 --- a/amdgcn/lib/workitem/get_num_groups.40.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.ngroups.x() nounwind readnone -declare i32 @llvm.r600.read.ngroups.y() nounwind readnone -declare i32 @llvm.r600.read.ngroups.z() nounwind readnone - -target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -define i64 @get_num_groups(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.ngroups.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 
@llvm.r600.read.ngroups.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.ngroups.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} diff --git a/amdgcn/lib/workitem/get_num_groups.cl b/amdgcn/lib/workitem/get_num_groups.cl new file mode 100644 index 0000000..f921414 --- /dev/null +++ b/amdgcn/lib/workitem/get_num_groups.cl @@ -0,0 +1,15 @@ +#include + +uint __clc_amdgcn_get_num_groups_x(void) __asm("llvm.r600.read.ngroups.x"); +uint __clc_amdgcn_get_num_groups_y(void) __asm("llvm.r600.read.ngroups.y"); +uint __clc_amdgcn_get_num_groups_z(void) __asm("llvm.r600.read.ngroups.z"); + +_CLC_DEF size_t get_num_groups(uint dim) +{ + switch (dim) { + case 0: return __clc_amdgcn_get_num_groups_x(); + case 1: return __clc_amdgcn_get_num_groups_y(); + case 2: return __clc_amdgcn_get_num_groups_z(); + default: return 1; + } +} diff --git a/amdgcn/lib/workitem/get_num_groups.ll b/amdgcn/lib/workitem/get_num_groups.ll deleted file mode 100644 index a364bec..0000000 --- a/amdgcn/lib/workitem/get_num_groups.ll +++ /dev/null @@ -1,23 +0,0 @@ -declare i32 @llvm.r600.read.ngroups.x() nounwind readnone -declare i32 @llvm.r600.read.ngroups.y() nounwind readnone -declare i32 @llvm.r600.read.ngroups.z() nounwind readnone - -target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" - -define i64 @get_num_groups(i32 %dim) nounwind readnone alwaysinline { - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label %y_dim i32 2, label %z_dim] -x_dim: - %x = call i32 @llvm.r600.read.ngroups.x() - %x.ext = zext i32 %x to i64 - ret i64 %x.ext -y_dim: - %y = call i32 @llvm.r600.read.ngroups.y() - %y.ext = zext i32 %y to i64 - ret i64 %y.ext -z_dim: - %z = call i32 @llvm.r600.read.ngroups.z() - %z.ext = zext i32 %z to i64 - ret i64 %z.ext -default: - ret i64 1 -} -- 2.18.1 From libclc-dev at 
lists.llvm.org Sun Oct 28 23:43:04 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:43:04 -0400 Subject: [Libclc-dev] [PATCH 4/5] amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file In-Reply-To: <20181029064305.12484-1-jan.vesely@rutgers.edu> References: <20181029064305.12484-1-jan.vesely@rutgers.edu> Message-ID: <20181029064305.12484-4-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- amdgcn/lib/SOURCES_3.9 | 1 - amdgcn/lib/SOURCES_4.0 | 1 - amdgcn/lib/mem_fence/fence.cl | 1 + amdgcn/lib/mem_fence/waitcnt.ll | 13 ------------- 4 files changed, 1 insertion(+), 15 deletions(-) delete mode 100644 amdgcn/lib/mem_fence/waitcnt.ll diff --git a/amdgcn/lib/SOURCES_3.9 b/amdgcn/lib/SOURCES_3.9 index 86a222e..c97d406 100644 --- a/amdgcn/lib/SOURCES_3.9 +++ b/amdgcn/lib/SOURCES_3.9 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -mem_fence/waitcnt.ll diff --git a/amdgcn/lib/SOURCES_4.0 b/amdgcn/lib/SOURCES_4.0 index 86a222e..c97d406 100644 --- a/amdgcn/lib/SOURCES_4.0 +++ b/amdgcn/lib/SOURCES_4.0 @@ -1,2 +1 @@ cl_khr_int64_extended_atomics/minmax_helpers.39.ll -mem_fence/waitcnt.ll diff --git a/amdgcn/lib/mem_fence/fence.cl b/amdgcn/lib/mem_fence/fence.cl index 408ffc3..b85baf7 100644 --- a/amdgcn/lib/mem_fence/fence.cl +++ b/amdgcn/lib/mem_fence/fence.cl @@ -14,6 +14,7 @@ void __clc_amdgcn_s_waitcnt(unsigned flags); # define __waitcnt(x) __builtin_amdgcn_s_waitcnt(x) #else # define __waitcnt(x) __clc_amdgcn_s_waitcnt(x) +_CLC_DEF void __clc_amdgcn_s_waitcnt(unsigned) __asm("llvm.amdgcn.s.waitcnt"); #endif _CLC_DEF void mem_fence(cl_mem_fence_flags flags) diff --git a/amdgcn/lib/mem_fence/waitcnt.ll b/amdgcn/lib/mem_fence/waitcnt.ll deleted file mode 100644 index ccf016a..0000000 --- a/amdgcn/lib/mem_fence/waitcnt.ll +++ /dev/null @@ -1,13 +0,0 @@ -declare void @llvm.amdgcn.s.waitcnt(i32) #0 - -target datalayout = 
"e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" - -; Export waitcnt intrinsic for clang < 5 -define void @__clc_amdgcn_s_waitcnt(i32 %flags) #1 { -entry: - tail call void @llvm.amdgcn.s.waitcnt(i32 %flags) - ret void -} - -attributes #0 = { nounwind } -attributes #1 = { nounwind alwaysinline } -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:43:05 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:43:05 -0400 Subject: [Libclc-dev] [PATCH 5/5] amdgcn-amdhsa: Convert get_{global, local}_size to clc for all llvm versions In-Reply-To: <20181029064305.12484-1-jan.vesely@rutgers.edu> References: <20181029064305.12484-1-jan.vesely@rutgers.edu> Message-ID: <20181029064305.12484-5-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- amdgcn-amdhsa/lib/OVERRIDES | 6 --- amdgcn-amdhsa/lib/OVERRIDES_3.9 | 3 -- amdgcn-amdhsa/lib/OVERRIDES_4.0 | 2 - amdgcn-amdhsa/lib/OVERRIDES_5.0 | 2 - amdgcn-amdhsa/lib/SOURCES_3.9 | 2 - amdgcn-amdhsa/lib/SOURCES_4.0 | 2 - amdgcn-amdhsa/lib/SOURCES_5.0 | 2 - .../lib/workitem/get_global_size.39.ll | 36 ----------------- .../lib/workitem/get_global_size.50.ll | 39 ------------------- amdgcn-amdhsa/lib/workitem/get_global_size.cl | 10 ++++- .../lib/workitem/get_local_size.39.ll | 35 ----------------- .../lib/workitem/get_local_size.50.ll | 38 ------------------ amdgcn-amdhsa/lib/workitem/get_local_size.cl | 10 ++++- 13 files changed, 16 insertions(+), 171 deletions(-) delete mode 100644 amdgcn-amdhsa/lib/OVERRIDES delete mode 100644 amdgcn-amdhsa/lib/OVERRIDES_3.9 delete mode 100644 amdgcn-amdhsa/lib/OVERRIDES_4.0 delete mode 100644 amdgcn-amdhsa/lib/OVERRIDES_5.0 delete mode 100644 amdgcn-amdhsa/lib/SOURCES_3.9 delete mode 100644 amdgcn-amdhsa/lib/SOURCES_4.0 delete mode 100644 amdgcn-amdhsa/lib/SOURCES_5.0 delete mode 100644 amdgcn-amdhsa/lib/workitem/get_global_size.39.ll delete mode 
100644 amdgcn-amdhsa/lib/workitem/get_global_size.50.ll delete mode 100644 amdgcn-amdhsa/lib/workitem/get_local_size.39.ll delete mode 100644 amdgcn-amdhsa/lib/workitem/get_local_size.50.ll diff --git a/amdgcn-amdhsa/lib/OVERRIDES b/amdgcn-amdhsa/lib/OVERRIDES deleted file mode 100644 index a7a694a..0000000 --- a/amdgcn-amdhsa/lib/OVERRIDES +++ /dev/null @@ -1,6 +0,0 @@ -workitem/get_num_groups.ll -workitem/get_global_size.ll -workitem/get_local_size.ll -workitem/get_num_groups.40.ll -workitem/get_global_size.40.ll -workitem/get_local_size.40.ll diff --git a/amdgcn-amdhsa/lib/OVERRIDES_3.9 b/amdgcn-amdhsa/lib/OVERRIDES_3.9 deleted file mode 100644 index dfe9c8e..0000000 --- a/amdgcn-amdhsa/lib/OVERRIDES_3.9 +++ /dev/null @@ -1,3 +0,0 @@ -workitem/get_global_size.cl -workitem/get_local_size.cl -workitem/get_num_groups.39.ll diff --git a/amdgcn-amdhsa/lib/OVERRIDES_4.0 b/amdgcn-amdhsa/lib/OVERRIDES_4.0 deleted file mode 100644 index ee3a48c..0000000 --- a/amdgcn-amdhsa/lib/OVERRIDES_4.0 +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_global_size.cl -workitem/get_local_size.cl diff --git a/amdgcn-amdhsa/lib/OVERRIDES_5.0 b/amdgcn-amdhsa/lib/OVERRIDES_5.0 deleted file mode 100644 index ee3a48c..0000000 --- a/amdgcn-amdhsa/lib/OVERRIDES_5.0 +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_global_size.cl -workitem/get_local_size.cl diff --git a/amdgcn-amdhsa/lib/SOURCES_3.9 b/amdgcn-amdhsa/lib/SOURCES_3.9 deleted file mode 100644 index a6a08af..0000000 --- a/amdgcn-amdhsa/lib/SOURCES_3.9 +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_global_size.39.ll -workitem/get_local_size.39.ll diff --git a/amdgcn-amdhsa/lib/SOURCES_4.0 b/amdgcn-amdhsa/lib/SOURCES_4.0 deleted file mode 100644 index 2b957ed..0000000 --- a/amdgcn-amdhsa/lib/SOURCES_4.0 +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_global_size.50.ll -workitem/get_local_size.50.ll diff --git a/amdgcn-amdhsa/lib/SOURCES_5.0 b/amdgcn-amdhsa/lib/SOURCES_5.0 deleted file mode 100644 index 2b957ed..0000000 --- 
a/amdgcn-amdhsa/lib/SOURCES_5.0 +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_global_size.50.ll -workitem/get_local_size.50.ll diff --git a/amdgcn-amdhsa/lib/workitem/get_global_size.39.ll b/amdgcn-amdhsa/lib/workitem/get_global_size.39.ll deleted file mode 100644 index b5e7db2..0000000 --- a/amdgcn-amdhsa/lib/workitem/get_global_size.39.ll +++ /dev/null @@ -1,36 +0,0 @@ -declare i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() #0 - -define i32 @get_global_size(i32 %dim) #1 { - %dispatch_ptr = call noalias nonnull dereferenceable(64) i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() - switch i32 %dim, label %default [ - i32 0, label %x - i32 1, label %y - i32 2, label %z - ] - -x: - %ptr_x = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i32 12 - %ptr_x32 = bitcast i8 addrspace(2)* %ptr_x to i32 addrspace(2)* - %x32 = load i32, i32 addrspace(2)* %ptr_x32, align 4, !invariant.load !0 - ret i32 %x32 - -y: - %ptr_y = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i32 16 - %ptr_y32 = bitcast i8 addrspace(2)* %ptr_y to i32 addrspace(2)* - %y32 = load i32, i32 addrspace(2)* %ptr_y32, align 4, !invariant.load !0 - ret i32 %y32 - -z: - %ptr_z = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i32 20 - %ptr_z32 = bitcast i8 addrspace(2)* %ptr_z to i32 addrspace(2)* - %z32 = load i32, i32 addrspace(2)* %ptr_z32, align 4, !invariant.load !0 - ret i32 %z32 - -default: - ret i32 1 -} - -attributes #0 = { nounwind readnone } -attributes #1 = { alwaysinline norecurse nounwind readonly } - -!0 = !{} diff --git a/amdgcn-amdhsa/lib/workitem/get_global_size.50.ll b/amdgcn-amdhsa/lib/workitem/get_global_size.50.ll deleted file mode 100644 index af0f2ea..0000000 --- a/amdgcn-amdhsa/lib/workitem/get_global_size.50.ll +++ /dev/null @@ -1,39 +0,0 @@ -declare i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() #0 - -define i64 @get_global_size(i32 %dim) #1 { - %dispatch_ptr = call noalias nonnull dereferenceable(64) i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() - switch i32 
%dim, label %default [ - i32 0, label %x - i32 1, label %y - i32 2, label %z - ] - -x: - %ptr_x = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i64 12 - %ptr_x32 = bitcast i8 addrspace(2)* %ptr_x to i32 addrspace(2)* - %x32 = load i32, i32 addrspace(2)* %ptr_x32, align 4, !invariant.load !0 - %size_x = zext i32 %x32 to i64 - ret i64 %size_x - -y: - %ptr_y = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i64 16 - %ptr_y32 = bitcast i8 addrspace(2)* %ptr_y to i32 addrspace(2)* - %y32 = load i32, i32 addrspace(2)* %ptr_y32, align 4, !invariant.load !0 - %size_y = zext i32 %y32 to i64 - ret i64 %size_y - -z: - %ptr_z = getelementptr inbounds i8, i8 addrspace(2)* %dispatch_ptr, i64 20 - %ptr_z32 = bitcast i8 addrspace(2)* %ptr_z to i32 addrspace(2)* - %z32 = load i32, i32 addrspace(2)* %ptr_z32, align 4, !invariant.load !0 - %size_z = zext i32 %z32 to i64 - ret i64 %size_z - -default: - ret i64 1 -} - -attributes #0 = { nounwind readnone } -attributes #1 = { alwaysinline norecurse nounwind readonly } - -!0 = !{} diff --git a/amdgcn-amdhsa/lib/workitem/get_global_size.cl b/amdgcn-amdhsa/lib/workitem/get_global_size.cl index 392cd08..2f95f99 100644 --- a/amdgcn-amdhsa/lib/workitem/get_global_size.cl +++ b/amdgcn-amdhsa/lib/workitem/get_global_size.cl @@ -8,10 +8,16 @@ #define CONST_AS __attribute__((address_space(2))) #endif +#if __clang_major__ >= 6 +#define __dispatch_ptr __builtin_amdgcn_dispatch_ptr +#else +#define __dispatch_ptr __clc_amdgcn_dispatch_ptr +CONST_AS uchar * __clc_amdgcn_dispatch_ptr(void) __asm("llvm.amdgcn.dispatch.ptr"); +#endif + _CLC_DEF size_t get_global_size(uint dim) { - CONST_AS uint * ptr = - (CONST_AS uint *) __builtin_amdgcn_dispatch_ptr(); + CONST_AS uint * ptr = (CONST_AS uint *) __dispatch_ptr(); if (dim < 3) return ptr[3 + dim]; return 1; diff --git a/amdgcn-amdhsa/lib/workitem/get_local_size.39.ll b/amdgcn-amdhsa/lib/workitem/get_local_size.39.ll deleted file mode 100644 index ecb5e8f..0000000 --- 
a/amdgcn-amdhsa/lib/workitem/get_local_size.39.ll +++ /dev/null @@ -1,35 +0,0 @@ -declare i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() #0 - -define i32 @get_local_size(i32 %dim) #1 { - %dispatch_ptr = call noalias nonnull dereferenceable(64) i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() - %dispatch_ptr_i32 = bitcast i8 addrspace(2)* %dispatch_ptr to i32 addrspace(2)* - %xy_size_ptr = getelementptr inbounds i32, i32 addrspace(2)* %dispatch_ptr_i32, i32 1 - %xy_size = load i32, i32 addrspace(2)* %xy_size_ptr, align 4, !invariant.load !0 - switch i32 %dim, label %default [ - i32 0, label %x_dim - i32 1, label %y_dim - i32 2, label %z_dim - ] - -x_dim: - %x_size = and i32 %xy_size, 65535 - ret i32 %x_size - -y_dim: - %y_size = lshr i32 %xy_size, 16 - ret i32 %y_size - -z_dim: - %z_size_ptr = getelementptr inbounds i32, i32 addrspace(2)* %dispatch_ptr_i32, i32 2 - %z_size = load i32, i32 addrspace(2)* %z_size_ptr, align 4, !invariant.load !0, !range !1 - ret i32 %z_size - -default: - ret i32 1 -} - -attributes #0 = { nounwind readnone } -attributes #1 = { alwaysinline norecurse nounwind readonly } - -!0 = !{} -!1 = !{ i32 0, i32 257 } diff --git a/amdgcn-amdhsa/lib/workitem/get_local_size.50.ll b/amdgcn-amdhsa/lib/workitem/get_local_size.50.ll deleted file mode 100644 index ff4b811..0000000 --- a/amdgcn-amdhsa/lib/workitem/get_local_size.50.ll +++ /dev/null @@ -1,38 +0,0 @@ -declare i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() #0 - -define i64 @get_local_size(i32 %dim) #1 { - %dispatch_ptr = call noalias nonnull dereferenceable(64) i8 addrspace(2)* @llvm.amdgcn.dispatch.ptr() - %dispatch_ptr_i32 = bitcast i8 addrspace(2)* %dispatch_ptr to i32 addrspace(2)* - %xy_size_ptr = getelementptr inbounds i32, i32 addrspace(2)* %dispatch_ptr_i32, i64 1 - %xy_size = load i32, i32 addrspace(2)* %xy_size_ptr, align 4, !invariant.load !0 - switch i32 %dim, label %default [ - i32 0, label %x_dim - i32 1, label %y_dim - i32 2, label %z_dim - ] - -x_dim: - %x_size = and i32 %xy_size, 
65535 - %x_size.ext = zext i32 %x_size to i64 - ret i64 %x_size.ext - -y_dim: - %y_size = lshr i32 %xy_size, 16 - %y_size.ext = zext i32 %y_size to i64 - ret i64 %y_size.ext - -z_dim: - %z_size_ptr = getelementptr inbounds i32, i32 addrspace(2)* %dispatch_ptr_i32, i64 2 - %z_size = load i32, i32 addrspace(2)* %z_size_ptr, align 4, !invariant.load !0, !range !1 - %z_size.ext = zext i32 %z_size to i64 - ret i64 %z_size.ext - -default: - ret i64 1 -} - -attributes #0 = { nounwind readnone } -attributes #1 = { alwaysinline norecurse nounwind readonly } - -!0 = !{} -!1 = !{ i32 0, i32 257 } diff --git a/amdgcn-amdhsa/lib/workitem/get_local_size.cl b/amdgcn-amdhsa/lib/workitem/get_local_size.cl index 64d1cf4..9f208d8 100644 --- a/amdgcn-amdhsa/lib/workitem/get_local_size.cl +++ b/amdgcn-amdhsa/lib/workitem/get_local_size.cl @@ -8,10 +8,16 @@ #define CONST_AS __attribute__((address_space(2))) #endif +#if __clang_major__ >= 6 +#define __dispatch_ptr __builtin_amdgcn_dispatch_ptr +#else +#define __dispatch_ptr __clc_amdgcn_dispatch_ptr +CONST_AS char * __clc_amdgcn_dispatch_ptr(void) __asm("llvm.amdgcn.dispatch.ptr"); +#endif + _CLC_DEF size_t get_local_size(uint dim) { - CONST_AS uint * ptr = - (CONST_AS uint *) __builtin_amdgcn_dispatch_ptr(); + CONST_AS uint * ptr = (CONST_AS uint *) __dispatch_ptr(); switch (dim) { case 0: return ptr[1] & 0xffffu; -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:45:13 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:45:13 -0400 Subject: [Libclc-dev] [PATCH 1/3] travis: Check tahiti-amdgcn-mesa-mesa3d.bc Message-ID: <20181029064515.12602-1-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- .travis.yml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/.travis.yml b/.travis.yml index 22688c6..49afa73 100644 --- a/.travis.yml +++ b/.travis.yml @@ -28,7 +28,7 @@ matrix: - LABEL="make gcc LLVM-4.0" - LLVM_VERSION=4.0 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}" - 
- CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" + - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc tahiti-amdgcn-mesa-mesa3d.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" addons: apt: sources: @@ -43,7 +43,7 @@ matrix: - LABEL="make gcc LLVM-5.0" - LLVM_VERSION=5.0 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}" - - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" + - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc tahiti-amdgcn-mesa-mesa3d.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" addons: apt: sources: @@ -58,7 +58,7 @@ matrix: - LABEL="make gcc LLVM-6.0" - LLVM_VERSION=6.0 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}" - - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" + - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc tahiti-amdgcn-mesa-mesa3d.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" # llvm passes -Werror=date-time which is only supported in gcc-4.9+ - MATRIX_EVAL="CC=gcc-4.9 && CXX=g++-4.9" addons: @@ -77,7 +77,7 @@ matrix: - LABEL="make gcc LLVM-7" - LLVM_VERSION=7 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}" - - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" + - CHECK_FILES="barts-r600--.bc cayman-r600--.bc cedar-r600--.bc cypress-r600--.bc tahiti-amdgcn--.bc amdgcn--amdhsa.bc tahiti-amdgcn-mesa-mesa3d.bc nvptx--nvidiacl.bc nvptx64--nvidiacl.bc" # llvm passes -Werror=date-time which is only supported in gcc-4.9+ - 
MATRIX_EVAL="CC=gcc-4.9 && CXX=g++-4.9" addons: -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:45:14 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:45:14 -0400 Subject: [Libclc-dev] [PATCH 2/3] configure: Provide symlink for amdgcn-mesa3d instead of configure hack In-Reply-To: <20181029064515.12602-1-jan.vesely@rutgers.edu> References: <20181029064515.12602-1-jan.vesely@rutgers.edu> Message-ID: <20181029064515.12602-2-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- amdgcn-mesa3d | 1 + configure.py | 2 -- 2 files changed, 1 insertion(+), 2 deletions(-) create mode 120000 amdgcn-mesa3d diff --git a/amdgcn-mesa3d b/amdgcn-mesa3d new file mode 120000 index 0000000..4007828 --- /dev/null +++ b/amdgcn-mesa3d @@ -0,0 +1 @@ +amdgcn-amdhsa \ No newline at end of file diff --git a/configure.py b/configure.py index 55ef1bb..8c021b9 100755 --- a/configure.py +++ b/configure.py @@ -187,8 +187,6 @@ for target in targets: for arch in archs: subdirs.append("%s-%s-%s" % (arch, t_vendor, t_os)) subdirs.append("%s-%s" % (arch, t_os)) - if t_os == 'mesa3d': - subdirs.append('amdgcn-amdhsa') subdirs.append(arch) if arch == 'amdgcn' or arch == 'r600': subdirs.append('amdgpu') -- 2.18.1 From libclc-dev at lists.llvm.org Sun Oct 28 23:45:15 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Mon, 29 Oct 2018 02:45:15 -0400 Subject: [Libclc-dev] [PATCH 3/3] Remove redundant OVERRRIDES file In-Reply-To: <20181029064515.12602-1-jan.vesely@rutgers.edu> References: <20181029064515.12602-1-jan.vesely@rutgers.edu> Message-ID: <20181029064515.12602-3-jan.vesely@rutgers.edu> Signed-off-by: Jan Vesely --- I'm not sure why this was kept around. Anyway, it's not necessary after the cleanup, so I'll only push this one after the two cleanup series land. The other two patches in this series are independent.
 amdgpu/lib/OVERRIDES | 2 -- 1 file changed, 2 deletions(-) delete mode 100644 amdgpu/lib/OVERRIDES diff --git a/amdgpu/lib/OVERRIDES b/amdgpu/lib/OVERRIDES deleted file mode 100644 index 3f941d8..0000000 --- a/amdgpu/lib/OVERRIDES +++ /dev/null @@ -1,2 +0,0 @@ -workitem/get_group_id.cl -workitem/get_global_size.cl -- 2.18.1 From libclc-dev at lists.llvm.org Mon Oct 29 05:58:23 2018 From: libclc-dev at lists.llvm.org (Aaron Watry via Libclc-dev) Date: Mon, 29 Oct 2018 07:58:23 -0500 Subject: [Libclc-dev] [PATCH 1/4] r600: Convert get_local_size to clc In-Reply-To: <20181029064029.12312-1-jan.vesely@rutgers.edu> References: <20181029064029.12312-1-jan.vesely@rutgers.edu> Message-ID: I'm out of town on vacation until tomorrow evening. I'll see when I have time to take a look at these. --Aaron On Mon, Oct 29, 2018, 1:40 AM Jan Vesely via Libclc-dev < libclc-dev at lists.llvm.org> wrote: > Signed-off-by: Jan Vesely > --- > This series consolidates the existing llvm asm variants into clc files. > I've verified that the code generated using llvm-5 is the same for the > get-local-size.cl, get-global-size.cl, get-num-groups.cl, and > global-memory.cl piglits (and all of them pass on Turks).
> > r600/lib/OVERRIDES_3.9 | 1 - > r600/lib/OVERRIDES_4.0 | 1 - > r600/lib/OVERRIDES_5.0 | 1 - > r600/lib/OVERRIDES_6.0 | 1 - > r600/lib/SOURCES | 2 +- > r600/lib/SOURCES_3.9 | 1 - > r600/lib/SOURCES_4.0 | 1 - > r600/lib/SOURCES_5.0 | 1 - > r600/lib/SOURCES_6.0 | 1 - > r600/lib/workitem/get_local_size.39.ll | 20 -------------------- > r600/lib/workitem/get_local_size.cl | 15 +++++++++++++++ > r600/lib/workitem/get_local_size.ll | 20 -------------------- > 12 files changed, 16 insertions(+), 49 deletions(-) > delete mode 100644 r600/lib/workitem/get_local_size.39.ll > create mode 100644 r600/lib/workitem/get_local_size.cl > delete mode 100644 r600/lib/workitem/get_local_size.ll > > diff --git a/r600/lib/OVERRIDES_3.9 b/r600/lib/OVERRIDES_3.9 > index c055c6d..e1a6ae8 100644 > --- a/r600/lib/OVERRIDES_3.9 > +++ b/r600/lib/OVERRIDES_3.9 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.ll > workitem/get_global_size.ll > -workitem/get_local_size.ll > workitem/get_num_groups.ll > diff --git a/r600/lib/OVERRIDES_4.0 b/r600/lib/OVERRIDES_4.0 > index c055c6d..e1a6ae8 100644 > --- a/r600/lib/OVERRIDES_4.0 > +++ b/r600/lib/OVERRIDES_4.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.ll > workitem/get_global_size.ll > -workitem/get_local_size.ll > workitem/get_num_groups.ll > diff --git a/r600/lib/OVERRIDES_5.0 b/r600/lib/OVERRIDES_5.0 > index c055c6d..e1a6ae8 100644 > --- a/r600/lib/OVERRIDES_5.0 > +++ b/r600/lib/OVERRIDES_5.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.ll > workitem/get_global_size.ll > -workitem/get_local_size.ll > workitem/get_num_groups.ll > diff --git a/r600/lib/OVERRIDES_6.0 b/r600/lib/OVERRIDES_6.0 > index c055c6d..e1a6ae8 100644 > --- a/r600/lib/OVERRIDES_6.0 > +++ b/r600/lib/OVERRIDES_6.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.ll > workitem/get_global_size.ll > -workitem/get_local_size.ll > workitem/get_num_groups.ll > diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES > index e69be4a..75cf901 100644 > --- a/r600/lib/SOURCES > +++ 
b/r600/lib/SOURCES > @@ -5,6 +5,6 @@ workitem/get_global_offset.cl > workitem/get_group_id.cl > workitem/get_global_size.ll > workitem/get_local_id.cl > -workitem/get_local_size.ll > +workitem/get_local_size.cl > workitem/get_num_groups.ll > workitem/get_work_dim.cl > diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9 > index ba09398..9f36052 100644 > --- a/r600/lib/SOURCES_3.9 > +++ b/r600/lib/SOURCES_3.9 > @@ -15,5 +15,4 @@ image/write_imageui.cl > image/write_image_impl.ll > synchronization/barrier_impl.39.ll > workitem/get_global_size.39.ll > -workitem/get_local_size.39.ll > workitem/get_num_groups.39.ll > diff --git a/r600/lib/SOURCES_4.0 b/r600/lib/SOURCES_4.0 > index 091990c..6ca2332 100644 > --- a/r600/lib/SOURCES_4.0 > +++ b/r600/lib/SOURCES_4.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.39.ll > workitem/get_global_size.39.ll > -workitem/get_local_size.39.ll > workitem/get_num_groups.39.ll > diff --git a/r600/lib/SOURCES_5.0 b/r600/lib/SOURCES_5.0 > index 091990c..6ca2332 100644 > --- a/r600/lib/SOURCES_5.0 > +++ b/r600/lib/SOURCES_5.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.39.ll > workitem/get_global_size.39.ll > -workitem/get_local_size.39.ll > workitem/get_num_groups.39.ll > diff --git a/r600/lib/SOURCES_6.0 b/r600/lib/SOURCES_6.0 > index 091990c..6ca2332 100644 > --- a/r600/lib/SOURCES_6.0 > +++ b/r600/lib/SOURCES_6.0 > @@ -1,4 +1,3 @@ > synchronization/barrier_impl.39.ll > workitem/get_global_size.39.ll > -workitem/get_local_size.39.ll > workitem/get_num_groups.39.ll > diff --git a/r600/lib/workitem/get_local_size.39.ll > b/r600/lib/workitem/get_local_size.39.ll > deleted file mode 100644 > index c9f2c84..0000000 > --- a/r600/lib/workitem/get_local_size.39.ll > +++ /dev/null > @@ -1,20 +0,0 @@ > -declare i32 @llvm.r600.read.local.size.x() nounwind readnone > -declare i32 @llvm.r600.read.local.size.y() nounwind readnone > -declare i32 @llvm.r600.read.local.size.z() nounwind readnone > - > -target datalayout = > 
"e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" > - > -define i32 @get_local_size(i32 %dim) nounwind readnone alwaysinline { > - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label > %y_dim i32 2, label %z_dim] > -x_dim: > - %x = call i32 @llvm.r600.read.local.size.x() > - ret i32 %x > -y_dim: > - %y = call i32 @llvm.r600.read.local.size.y() > - ret i32 %y > -z_dim: > - %z = call i32 @llvm.r600.read.local.size.z() > - ret i32 %z > -default: > - ret i32 1 > -} > diff --git a/r600/lib/workitem/get_local_size.cl b/r600/lib/workitem/ > get_local_size.cl > new file mode 100644 > index 0000000..89e2612 > --- /dev/null > +++ b/r600/lib/workitem/get_local_size.cl > @@ -0,0 +1,15 @@ > +#include > + > +uint __clc_r600_get_local_size_x(void) > __asm("llvm.r600.read.local.size.x"); > +uint __clc_r600_get_local_size_y(void) > __asm("llvm.r600.read.local.size.y"); > +uint __clc_r600_get_local_size_z(void) > __asm("llvm.r600.read.local.size.z"); > + > +_CLC_DEF size_t get_local_size(uint dim) > +{ > + switch (dim) { > + case 0: return __clc_r600_get_local_size_x(); > + case 1: return __clc_r600_get_local_size_y(); > + case 2: return __clc_r600_get_local_size_z(); > + default: return 1; > + } > +} > diff --git a/r600/lib/workitem/get_local_size.ll > b/r600/lib/workitem/get_local_size.ll > deleted file mode 100644 > index 04ce076..0000000 > --- a/r600/lib/workitem/get_local_size.ll > +++ /dev/null > @@ -1,20 +0,0 @@ > -declare i32 @llvm.r600.read.local.size.x() nounwind readnone > -declare i32 @llvm.r600.read.local.size.y() nounwind readnone > -declare i32 @llvm.r600.read.local.size.z() nounwind readnone > - > -target datalayout = > "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5" > - > -define i32 @get_local_size(i32 %dim) nounwind readnone alwaysinline { > - switch i32 %dim, label %default [i32 0, label %x_dim i32 1, label > %y_dim i32 
2, label %z_dim] > -x_dim: > - %x = call i32 @llvm.r600.read.local.size.x() > - ret i32 %x > -y_dim: > - %y = call i32 @llvm.r600.read.local.size.y() > - ret i32 %y > -z_dim: > - %z = call i32 @llvm.r600.read.local.size.z() > - ret i32 %z > -default: > - ret i32 1 > -} > -- > 2.18.1 > > _______________________________________________ > Libclc-dev mailing list > Libclc-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev From libclc-dev at lists.llvm.org Wed Oct 31 09:13:01 2018 From: libclc-dev at lists.llvm.org (Jan Vesely via Libclc-dev) Date: Wed, 31 Oct 2018 12:13:01 -0400 Subject: [Libclc-dev] [PATCH 1/4] r600: Convert get_local_size to clc In-Reply-To: References: <20181029064029.12312-1-jan.vesely@rutgers.edu> Message-ID: <3da0c99baf4cd956976b5da2b4fdeafa67afd2e9.camel@rutgers.edu> On Mon, 2018-10-29 at 07:58 -0500, Aaron Watry via Libclc-dev wrote: > I'm out of town on vacation until tomorrow evening. I'll see when I have > time to take a look at these. thanks. These are the first cleanups to make transitioning to a new build system easier. The changes are now included in my semi-regular piglit runs [0,1,2], and there is also an appveyor job that provides built libraries as artifacts [3] Jan [0] https://jvesely.github.io/piglit/radeon-latest-5/problems.html [1] https://jvesely.github.io/piglit/gcn-latest-3/problems.html [2] https://jvesely.github.io/piglit/raven-latest-5/problems.html [3] https://ci.appveyor.com/project/jvesely/libclc
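For readers who want to try the dispatch pattern from the r600 get_local_size.cl patch on a host machine, the following is a minimal sketch in plain C, not the libclc code itself. The `mock_read_local_size_*` helpers are hypothetical stand-ins for the `llvm.r600.read.local.size.{x,y,z}` intrinsics, the name `get_local_size_sketch` is made up for illustration, and the sizes 64, 4 and 1 are arbitrary values rather than anything from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical work-group sizes; on a real device these would come from
 * the llvm.r600.read.local.size.{x,y,z} intrinsics, not from an array. */
static const uint32_t mock_local_size[3] = { 64, 4, 1 };

/* Hypothetical stand-ins for the three per-dimension intrinsics. */
static uint32_t mock_read_local_size_x(void) { return mock_local_size[0]; }
static uint32_t mock_read_local_size_y(void) { return mock_local_size[1]; }
static uint32_t mock_read_local_size_z(void) { return mock_local_size[2]; }

/* Same shape as the switch in the patch: one accessor per dimension,
 * and 1 for any dimension index outside 0..2. */
size_t get_local_size_sketch(uint32_t dim)
{
    switch (dim) {
    case 0:  return mock_read_local_size_x();
    case 1:  return mock_read_local_size_y();
    case 2:  return mock_read_local_size_z();
    default: return 1;
    }
}
```

Calling `get_local_size_sketch(0)` with the values above yields 64, and any index of 3 or more falls through to the `default` branch and yields 1, which matches the `default: return 1;` behaviour in both the removed .ll files and the new .cl file.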